Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space

@article{Neelakantan2014EfficientNE,
  title={Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space},
  author={Arvind Neelakantan and Jeevan Shankar and Alexandre Passos and A. McCallum},
  journal={ArXiv},
  year={2014},
  volume={abs/1504.06654}
}
There is rising interest in vector-space word embeddings and their use in NLP, especially given recent methods for their fast estimation at very large scale. [...] Key Method: It differs from recent related work by jointly performing word sense discrimination and embedding learning, by non-parametrically estimating the number of senses per word type, and by its efficiency and scalability. We present new state-of-the-art results in the word similarity in context task and demonstrate its scalability by training…
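The non-parametric sense assignment sketched in the abstract (one cluster of context vectors per sense, with new senses created on the fly when a context matches no existing cluster well) can be illustrated roughly as follows. This is a minimal sketch under assumptions, not the authors' released implementation: the function name assign_sense, the similarity threshold lam, and the cap max_senses are introduced here for the example, and in the full model the returned sense index would select which sense vector of the word is then updated by skip-gram training.

```python
import numpy as np

def assign_sense(context_vec, sense_centers, sense_counts, lam=-0.5, max_senses=30):
    """Pick (or create) a sense cluster for one occurrence of a word type.

    context_vec:   average of the context word vectors for this occurrence
    sense_centers: list of running cluster-center vectors, one per sense
    sense_counts:  number of occurrences assigned to each sense so far
    lam:           similarity threshold below which a new sense is spawned
                   (a tunable hyperparameter; the value here is illustrative)
    max_senses:    assumed upper bound on senses per word type
    """
    if sense_centers:
        # cosine similarity between this context and every existing sense cluster
        sims = [
            float(np.dot(context_vec, c))
            / (np.linalg.norm(context_vec) * np.linalg.norm(c) + 1e-8)
            for c in sense_centers
        ]
        best = int(np.argmax(sims))
        if sims[best] >= lam or len(sense_centers) >= max_senses:
            # assign to the nearest existing sense and update its center online
            sense_counts[best] += 1
            sense_centers[best] += (context_vec - sense_centers[best]) / sense_counts[best]
            return best
    # no sufficiently similar cluster: spawn a new sense for this word type
    sense_centers.append(np.array(context_vec, dtype=float, copy=True))
    sense_counts.append(1)
    return len(sense_centers) - 1

# toy usage with random 50-d context vectors (hypothetical data)
rng = np.random.default_rng(0)
centers, counts = [], []
for _ in range(5):
    sense_id = assign_sense(rng.normal(size=50), centers, counts)
```

The number of senses thus grows with the data rather than being fixed in advance, which is the "non-parametric" aspect the abstract highlights; the parametric variant simply fixes the number of clusters per word.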
Multi Sense Embeddings from Topic Models
TLDR
This work proposes a topic modeling based skip-gram approach for learning multi-prototype word embeddings and introduces a method to prune the embeddings determined by the probabilistic representation of the word in each topic.
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
TLDR
A general architecture to learn the word and topic embeddings efficiently is presented, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously.
Different Contexts Lead to Different Word Embeddings
TLDR
Experimental results show that the word representations learned by the proposed CBOW-based model outperform the competitive baselines, not only improving the quality of the embeddings but also making them suitable for polysemy.
PAWE: Polysemy Aware Word Embeddings
TLDR
This work develops a new word embedding model that can accurately represent polysemous words by automatically learning multiple representations for each word, whilst remaining computationally efficient.
Improving Distributed Representation of Word Sense via WordNet Gloss Composition and Context Clustering
TLDR
The learned representations outperform the publicly available embeddings on 2 out of 4 metrics in the word similarity task, and 6 out of 13 subtasks in the analogical reasoning task.
Embedding Words and Senses Together via Joint Knowledge-Enhanced Training
TLDR
This work proposes a new model which learns word and sense embeddings jointly and exploits large corpora and knowledge from semantic networks in order to produce a unified vector space of words and senses.
Embeddings in Natural Language Processing
TLDR
This tutorial will provide a high-level synthesis of the main embedding techniques in NLP, in the broad sense, starting with conventional word embeddings and then moving to other types of embeddings, such as sense-specific and graph-based alternatives.
Learning class-specific word embeddings
TLDR
This work proposes to use class information to enhance the discriminativeness of words and presents a general framework consisting of a pair of convolutional neural networks to utilize the learned class-specific word embeddings as input for text classification tasks.
Contextualized Word Representations for Multi-Sense Embedding
TLDR
Methods are proposed to generate multiple representations for each word based on dependency-structure relations; they significantly outperform state-of-the-art multi-sense embedding methods and show that the data sparseness problem is alleviated by pre-training.
Gaussian Mixture Embeddings for Multiple Word Prototypes
TLDR
This paper proposes the Gaussian mixture skip-gram (GMSG) model and its dynamic variant (D-GMSG), in which each word is represented as a Gaussian mixture distribution in the embedding space and each Gaussian component corresponds to a word sense.

References

Showing 1-10 of 33 references
Multi-View Learning of Word Embeddings via CCA
TLDR
Low Rank Multi-View Learning (LR-MVL) is extremely fast, gives guaranteed convergence to a global optimum, is theoretically elegant, and achieves state-of-the-art performance on named entity recognition (NER) and chunking problems.
Lexicon Infused Phrase Embeddings for Named Entity Resolution
TLDR
A new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations is presented, along with the first system to use neural word embeddings to achieve state-of-the-art results on named-entity recognition on both the CoNLL and OntoNotes NER datasets.
Word Representations: A Simple and General Method for Semi-Supervised Learning
TLDR
This work evaluates Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking, and finds that each of the three word representations improves the accuracy of these baselines.
Tailoring Continuous Word Representations for Dependency Parsing
TLDR
It is found that all embeddings yield significant parsing gains, including some recent ones that can be trained in a fraction of the time of others, suggesting their complementarity.
Improving Word Representations via Global Context and Multiple Word Prototypes
TLDR
A new neural network architecture is presented which learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and accounts for homonymy and polysemy by learning multiple embeddings per word.
Efficient Estimation of Word Representations in Vector Space
TLDR
Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.
Distributed Representations of Words and Phrases and their Compositionality
TLDR
This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
Distributed Representations of Sentences and Documents
TLDR
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
Learning Word Vectors for Sentiment Analysis
TLDR
This work presents a model that uses a mix of unsupervised and supervised techniques to learn word vectors capturing semantic term-document information as well as rich sentiment content, and finds it outperforms several previously introduced methods for sentiment classification.
A Neural Probabilistic Language Model
TLDR
This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.