GloVe: Global Vectors for Word Representation

@inproceedings{Pennington2014GloVeGV,
  title={GloVe: Global Vectors for Word Representation},
  author={Jeffrey Pennington and Richard Socher and Christopher D. Manning},
  booktitle={EMNLP},
  year={2014}
}
Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. [...] Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word cooccurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with…
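For reference, the model summarized above fits word and context vectors by weighted least squares over the nonzero co-occurrence counts; the objective given in the paper is:

```latex
% GloVe objective: the sum runs only over nonzero co-occurrence counts X_{ij}
J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^{2},
\qquad
f(x) =
\begin{cases}
  (x / x_{\max})^{\alpha} & \text{if } x < x_{\max}, \\
  1 & \text{otherwise,}
\end{cases}
```

where the paper's experiments use $x_{\max} = 100$ and $\alpha = 3/4$.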
Rehabilitation of Count-Based Models for Word Vector Representations
TLDR: A systematic study of the use of the Hellinger distance to extract semantic representations from the word co-occurrence statistics of large text corpora shows that this distance gives good performance on word similarity and analogy tasks, with a proper type and size of context, and a dimensionality reduction based on a stochastic low-rank approximation.
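For orientation, the Hellinger distance in question is the standard one between two discrete distributions $p$ and $q$ over contexts (stated here for reference, not quoted from the cited paper):

```latex
H(p, q) = \frac{1}{\sqrt{2}} \bigl\lVert \sqrt{p} - \sqrt{q} \bigr\rVert_{2}
        = \frac{1}{\sqrt{2}} \sqrt{ \sum_{c} \bigl( \sqrt{p_c} - \sqrt{q_c} \bigr)^{2} }
```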
Modeling Semantic Relatedness using Global Relation Vectors
TLDR: A novel method which directly learns relation vectors from co-occurrence statistics is introduced, and it is shown how relation vectors can be naturally embedded into the resulting vector space.
Measuring Enrichment Of Word Embeddings With Subword And Dictionary Information
TLDR: Results show that fine-tuning the vectors with semantic information dramatically improves performance in word similarity; conversely, enriching word vectors with subword information increases performance in word analogy tasks, with the hybrid approach finding a solid middle ground.
Modeling Context Words as Regions: An Ordinal Regression Approach to Word Embedding
TLDR: The underlying ranking interpretation of word contexts is sufficient to match, and sometimes outperform, the performance of popular methods such as Skip-gram, and by using a quadratic kernel, the model can effectively learn word regions, which outperform existing unsupervised models for the task of hypernym detection.
Analyzing Structures in the Semantic Vector Space: A Framework for Decomposing Word Embeddings
TLDR: A framework is presented for decomposing word embeddings into smaller meaningful units, called sub-vectors, which opens up a wide range of possibilities for analyzing phenomena in vector space semantics as well as for solving concrete NLP problems.
Word2Box: Learning Word Representation Using Box Embeddings
TLDR: This model takes a region-based approach to the problem of word representation, representing words as n-dimensional rectangles; this provides additional geometric operations such as intersection and containment, which allow it to model co-occurrence patterns that vectors struggle with.
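As an illustration of the geometric operations mentioned, the sketch below intersects hard, axis-aligned boxes with made-up coordinates; the actual model's parameterization and training are more involved, and this only shows the kind of region operations vectors lack:

```python
import numpy as np

def box_intersection_volume(lo1, hi1, lo2, hi2):
    """Volume of the overlap of two axis-aligned boxes (hard intersection)."""
    lo = np.maximum(lo1, lo2)            # lower corner of the overlap
    hi = np.minimum(hi1, hi2)            # upper corner of the overlap
    sides = np.clip(hi - lo, 0.0, None)  # no overlap along an axis -> side 0
    return float(np.prod(sides))

def contains(lo_outer, hi_outer, lo_inner, hi_inner):
    """True if the second box lies entirely inside the first."""
    return bool(np.all(lo_outer <= lo_inner) and np.all(hi_inner <= hi_outer))

# Hypothetical 2-d boxes standing in for two word representations.
a_lo, a_hi = np.array([0.0, 0.0]), np.array([1.0, 1.0])
b_lo, b_hi = np.array([0.5, 0.2]), np.array([1.5, 0.8])
print(box_intersection_volume(a_lo, a_hi, b_lo, b_hi))  # 0.5 * 0.6 = 0.3
print(contains(a_lo, a_hi, b_lo, b_hi))                 # False: b sticks out of a
```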
Fast PMI-Based Word Embedding with Efficient Use of Unobserved Patterns
TLDR: A new word embedding algorithm is proposed that works on a smoothed Positive Pointwise Mutual Information (PPMI) matrix obtained from word-word co-occurrence counts, together with a kernel similarity measure for the latent space that can effectively calculate similarities in high dimensions.
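For context, a PPMI matrix of the kind mentioned can be computed from raw counts as below. This is a minimal sketch using the common context-distribution smoothing; the exact smoothing and the kernel used in the cited paper may differ:

```python
import numpy as np

def ppmi(counts, alpha=0.75):
    """Positive PMI from a word-word co-occurrence count matrix."""
    total = counts.sum()
    p_wc = counts / total                          # joint probabilities
    p_w = counts.sum(axis=1, keepdims=True) / total
    ctx = counts.sum(axis=0) ** alpha              # smoothed context counts
    p_c = ctx / ctx.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    return np.maximum(pmi, 0.0)                    # clip negatives and -inf to 0

counts = np.array([[10.0, 0.0, 3.0],
                   [0.0,  5.0, 1.0],
                   [3.0,  1.0, 8.0]])
print(ppmi(counts))
```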
PAWE: Polysemy Aware Word Embeddings
TLDR: This work develops a new word embedding model that can accurately represent polysemous words by automatically learning multiple representations for each word, while remaining computationally efficient.
Distributed Representation of Words in Vector Space for Kannada Language
TLDR: A distributed representation for Kannada words is proposed using an optimal neural network model and combining various known techniques to improve the vector space representation.
Learning Word Vectors with Linear Constraints: A Matrix Factorization Approach
TLDR: Two new embedding models based on the singular value decomposition of lexical co-occurrences of words are proposed; they allow linear constraints to be injected when performing the decomposition, so that the desired semantic and syntactic information is preserved in the word vectors.
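A minimal sketch of the unconstrained baseline, factorizing an association matrix (e.g. PPMI) with a truncated SVD; the linear-constraint machinery that distinguishes the proposed models is not reproduced here:

```python
import numpy as np

def svd_word_vectors(M, dim=2, p=0.5):
    """Rank-`dim` factorization of an association matrix M (e.g. PPMI).

    Returns word vectors U_k * S_k**p; the symmetric split p=0.5 is one
    common choice, not necessarily the one used in the cited paper.
    """
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :dim] * (S[:dim] ** p)

M = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.5, 0.3],
              [1.0, 0.3, 2.2]])
W = svd_word_vectors(M, dim=2)
print(W.shape)  # (3, 2): one 2-d vector per word
```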

References

Showing 1-10 of 40 references
Linguistic Regularities in Continuous Space Word Representations
TLDR: The vector-space word representations that are implicitly learned by the input-layer weights are found to be surprisingly good at capturing syntactic and semantic regularities in language, and each relationship is found to be characterized by a relation-specific vector offset.
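A minimal sketch of the relation-specific vector-offset idea on made-up 2-d vectors (real experiments use embeddings trained on large corpora):

```python
import numpy as np

def analogy(vectors, a, b, c):
    """Solve a : b :: c : ? with the vector-offset rule (b - a + c)."""
    target = vectors[b] - vectors[a] + vectors[c]
    target = target / np.linalg.norm(target)
    best, best_score = None, -np.inf
    for w, v in vectors.items():
        if w in (a, b, c):
            continue                                     # exclude the query words
        score = float(target @ (v / np.linalg.norm(v)))  # cosine similarity
        if score > best_score:
            best, best_score = w, score
    return best

# Hypothetical toy vectors; real embeddings come from a trained model.
vecs = {"king": np.array([0.9, 0.8]), "man": np.array([0.9, 0.1]),
        "woman": np.array([0.1, 0.1]), "queen": np.array([0.1, 0.8])}
print(analogy(vecs, "man", "king", "woman"))  # expected: "queen"
```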
Linguistic Regularities in Sparse and Explicit Word Representations
TLDR: It is demonstrated that analogy recovery is not restricted to neural word embeddings, and that a similar amount of relational similarities can be recovered from traditional distributional word representations.
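For reference, the analogy a : a* :: b : b* is recovered in that line of work either additively (the offset rule above) or with a multiplicative scoring rule of roughly the following form, where epsilon is a small constant preventing division by zero:

```latex
b^{*} = \operatorname*{arg\,max}_{x \in V \setminus \{a,\, a^{*},\, b\}}
        \frac{\cos(x, b)\,\cos(x, a^{*})}{\cos(x, a) + \varepsilon}
```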
Better Word Representations with Recursive Neural Networks for Morphology
TLDR: This paper combines recursive neural networks, in which each morpheme is a basic unit, with neural language models in order to consider contextual information when learning morphologically aware word representations, and proposes a novel model capable of building representations for morphologically complex words from their morphemes.
Efficient Estimation of Word Representations in Vector Space
TLDR: Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed, and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.
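As a small illustration of the data one of these architectures is trained on, the sketch below enumerates the (center, context) pairs a Skip-gram model predicts; the models themselves (and the CBOW counterpart, hierarchical softmax, etc.) are not implemented here:

```python
def skipgram_pairs(tokens, window=2):
    """(center, context) training pairs as consumed by a Skip-gram model."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                     # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

print(skipgram_pairs("the quick brown fox".split(), window=1))
# [('the', 'quick'), ('quick', 'the'), ('quick', 'brown'), ('brown', 'quick'),
#  ('brown', 'fox'), ('fox', 'brown')]
```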
Improving Word Representations via Global Context and Multiple Word Prototypes
TLDR: A new neural network architecture is presented which learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and accounts for homonymy and polysemy by learning multiple embeddings per word.
Distributed Representations of Words and Phrases and their Compositionality
TLDR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
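For reference, the negative-sampling objective replaces the full softmax with the following per-pair term for an observed (input word $w_I$, output word $w_O$) pair:

```latex
\log \sigma\!\bigl( {v'_{w_O}}^{\top} v_{w_I} \bigr)
  + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}
      \Bigl[ \log \sigma\!\bigl( -{v'_{w_i}}^{\top} v_{w_I} \bigr) \Bigr]
```

with $k$ sampled negatives and a noise distribution $P_n(w)$ proportional to the unigram distribution raised to the 3/4 power.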
Learning word embeddings efficiently with noise-contrastive estimation
TLDR: This work proposes a simple and scalable new approach to learning word embeddings based on training log-bilinear models with noise-contrastive estimation, and achieves results comparable to the best ones reported, using four times less data and more than an order of magnitude less computing time.
Word Representations: A Simple and General Method for Semi-Supervised Learning
TLDR: This work evaluates Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking, and finds that each of the three word representations improves the accuracy of these baselines.
A Neural Probabilistic Language Model
TLDR: This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.
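For reference, this model scores the next word from the concatenated feature vectors $x$ of the $n-1$ preceding words, roughly as follows (a simplified statement of the architecture, with the optional direct input-to-output connections included as $Wx$):

```latex
y = b + W x + U \tanh(d + H x), \qquad
P(w_t \mid w_{t-1}, \ldots, w_{t-n+1}) = \frac{e^{y_{w_t}}}{\sum_{i} e^{y_{i}}}
```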
Word Embeddings through Hellinger PCA
TLDR: This work proposes to drastically simplify the word embeddings computation through a Hellinger PCA of the word co-occurrence matrix and shows that it can provide an easy way to adapt embeddings to specific tasks.
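A minimal sketch of the core transformation (square-rooting per-word context distributions, then PCA); the cited work's choices of context, vocabulary, and randomized solver are not reproduced here:

```python
import numpy as np

def hellinger_pca(counts, dim=2):
    """PCA of square-rooted context distributions (the Hellinger-PCA idea)."""
    P = counts / counts.sum(axis=1, keepdims=True)   # per-word context distribution
    X = np.sqrt(P)                                   # Hellinger map
    X = X - X.mean(axis=0)                           # center before PCA
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :dim] * S[:dim]

counts = np.array([[4.0, 1.0, 0.0],
                   [1.0, 3.0, 2.0],
                   [0.0, 2.0, 5.0]])
print(hellinger_pca(counts, dim=2))  # one low-dimensional vector per word
```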