The Wayback Machine - https://web.archive.org/web/20200709174529/https://github.com/topics/word2vec
#

word2vec

Here are 1,210 public repositories matching this topic...

gensim
dmyersturnbull commented May 1, 2020

This is an awesome library, thanks @ddbourgin!!

Users might not know the best way to install this package and try it out. (I didn't, so I eventually just copied the source files.)
Neither the README nor Read the Docs has install instructions.

I couldn't find it on PyPI or Anaconda, and there doesn't appear to be a pyproject.toml, setup.cfg, setup.py, or conda recipe.

Moreover, the t

Luke-in-the-sky commented Jun 3, 2018

Hi there,

I think there might be a mistake in the documentation. The Understanding Scaled F-Score section says

The F-Score of these two values is defined as:

$$ \mathcal{F}_\beta(\mbox{prec}, \mbox{freq}) = (1 + \beta^2) \frac{\mbox{prec} \cdot \mbox{freq}}{\beta^2 \cdot \mbox{prec} + \mbox{freq}}. $$

$\beta \in \mathcal{R}^+$ is a scaling factor where frequency is favored if $\beta
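
For readers following the formula quoted above, here is a minimal Python sketch of it exactly as written in the excerpt. The function name `f_beta` and its argument names are illustrative assumptions, not the library's API, and the sketch takes no position on whether the documentation or the implementation is the one at fault.

```python
def f_beta(prec: float, freq: float, beta: float = 1.0) -> float:
    """F-score of `prec` and `freq` as quoted above (hypothetical helper).

    Computes (1 + beta^2) * prec * freq / (beta^2 * prec + freq).
    """
    if beta <= 0:
        raise ValueError("beta must be a positive real number")
    denominator = beta ** 2 * prec + freq
    if denominator == 0:
        # Both inputs are zero; define the score as 0 to avoid dividing by zero.
        return 0.0
    return (1 + beta ** 2) * prec * freq / denominator


# With beta = 1 the formula reduces to the harmonic mean of the two values.
balanced = f_beta(0.5, 0.5, beta=1.0)
# Varying beta shifts the score toward one of the two inputs.
score_beta_2 = f_beta(1.0, 0.5, beta=2.0)
score_beta_half = f_beta(1.0, 0.5, beta=0.5)
```

Evaluating the same pair of inputs at a few values of `beta` is a quick way to check which of the two quantities a given `beta` actually favors.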

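A deep learning framework for natural language processing based on PyTorch and torchtext, with five functional modules: sequence labeling, text classification, sentence relations, text generation, and structure analysis. It already implements named entity recognition, Chinese word segmentation, part-of-speech tagging, semantic role labeling, sentiment analysis, relation extraction, language modeling, text similarity, textual entailment, dependency parsing, word vector training, chatbots, machine translation, text summarization, and more. The framework is feature-rich, works out of the box, and is very easy to pick up! Most of it comes from studying other people's implementations and then modifying and merging them into the framework; the hyperparameters have not been carefully tuned, and there are quite a few bugs~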

  • Updated Jan 10, 2020
  • Python
Happy-zyy commented Nov 13, 2018

Hello, I used your implementation of word2vec.py in assignment1 as a reference, but the gradient check reported an error when I ran it.
==== Gradient check for skip-gram ====
Gradient check failed.
First gradient error found at index (0, 0)
Your gradient: -0.087147 Numerical gradient: 1254.567123
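I implemented it in py3. All of the earlier code is almost identical to yours and passed correctly; only this part fails. I then copied your code in its entirety and ran it, and it reported the same error. Do you know what is going on? Did it pass when you ran it?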
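
For context, a gradient check of this kind compares an analytic gradient against a central-difference estimate. The sketch below is a generic, pure-Python illustration under that assumption; it is not the assignment's own checker, and the names `numerical_gradient`, `gradient_check`, `loss`, `loss_grad`, and `bad_grad` are made up for the example. It does not diagnose the failure reported above.

```python
def numerical_gradient(f, x, h=1e-4):
    """Central-difference estimate of df/dx_i for each coordinate of list x."""
    grad = []
    for i in range(len(x)):
        original = x[i]
        x[i] = original + h
        f_plus = f(x)
        x[i] = original - h
        f_minus = f(x)
        x[i] = original  # restore the coordinate before moving on
        grad.append((f_plus - f_minus) / (2 * h))
    return grad


def gradient_check(f, analytic_grad, x, h=1e-4, tol=1e-5):
    """Return (passed, index_of_first_mismatch_or_None)."""
    numeric = numerical_gradient(f, list(x), h)
    analytic = analytic_grad(x)
    for i, (a, n) in enumerate(zip(analytic, numeric)):
        # Relative error, with a floor of 1 so tiny gradients do not blow up.
        rel_err = abs(a - n) / max(1.0, abs(a), abs(n))
        if rel_err > tol:
            return False, i
    return True, None


def loss(x):
    """Toy objective: sum of squares."""
    return sum(v * v for v in x)


def loss_grad(x):
    """Correct gradient of the toy objective."""
    return [2 * v for v in x]


def bad_grad(x):
    """Deliberately wrong gradient (missing the factor of 2)."""
    return [v for v in x]


point = [1.0, -2.0, 3.0]
ok, idx = gradient_check(loss, loss_grad, point)
bad_ok, bad_idx = gradient_check(loss, bad_grad, point)
```

Note that a check like this only makes sense if `f` returns the same value for the same input on every call; if the objective draws random samples, the random state has to be fixed identically before each evaluation.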

JiaWenqi commented Mar 13, 2019

def get_all_words(self):
    """ Return all words tokenized, in lowercase and without punctuation """
    return [w.lower() for w in word_tokenize(self.text) if w not in string.punctuation]
I found that this function removes only punctuation from the text. Other kinds of tokens, such as stopwords, are not removed.
e.g.:
`from nltk.corpus import stopwords
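
The excerpt is cut off here, so the commenter's actual proposal is not shown. As a rough, self-contained illustration of the idea (filtering stopwords in addition to punctuation), the sketch below uses a regular expression in place of `word_tokenize` and a tiny hard-coded `STOPWORDS` set in place of NLTK's stopword list; both substitutions are assumptions made so the example runs without extra downloads.

```python
import re
import string

# Hypothetical stand-in for a real stopword list such as NLTK's.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "in", "to"}
PUNCTUATION = set(string.punctuation)


def get_all_words(text, stopwords=STOPWORDS):
    """Return lowercase tokens with punctuation and stopwords removed."""
    # Simple tokenizer: runs of word characters, or single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    words = []
    for token in tokens:
        lowered = token.lower()
        if lowered in PUNCTUATION:
            continue  # drop punctuation tokens
        if lowered in stopwords:
            continue  # drop stopwords, compared in lowercase
        words.append(lowered)
    return words


words = get_all_words("The cat sat on the mat, and the dog barked.")
```

Lowercasing before the stopword comparison matters: a capitalized "The" would otherwise slip past a lowercase stopword list.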

word2vec, sentence2vec, machine reading comprehension, dialog system, text classification, pretrained language model (e.g., XLNet, BERT, ELMo, GPT), sequence labeling, information retrieval, information extraction (i.e., entity, relation and event extraction), knowledge graph, text generation, network embedding

  • Updated Jun 1, 2020
  • OpenEdge ABL
