Corpus ID: 2407601

Distributed Representations of Sentences and Documents

@article{Le2014DistributedRO,
  title={Distributed Representations of Sentences and Documents},
  author={Quoc V. Le and Tomas Mikolov},
  journal={ArXiv},
  year={2014},
  volume={abs/1405.4053}
}
  • Quoc V. Le, Tomas Mikolov
  • Published 2014
  • Computer Science
  • ArXiv
  • Many machine learning algorithms require the input to be represented as a fixed-length feature vector. [...] Our algorithm represents each document by a dense vector which is trained to predict words in the document. Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models. Empirical results show that Paragraph Vectors outperform bag-of-words models as well as other techniques for text representations. Finally, we achieve new state-of-the-art results on…
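
    The abstract's central idea, one dense vector per document trained to predict the words in that document, can be illustrated with a toy sketch. This is not the authors' implementation: the excerpt above gives no training details, so the full softmax, plain SGD, the hyperparameters, and the name `train_doc_vectors` are all assumptions chosen for brevity.

    ```python
    import numpy as np

    def train_doc_vectors(docs, dim=8, epochs=200, lr=0.1, seed=0):
        """Toy sketch (hypothetical helper): learn one dense vector per document
        by training it to predict the words that occur in that document."""
        rng = np.random.default_rng(seed)
        vocab = sorted({w for d in docs for w in d})
        word_id = {w: i for i, w in enumerate(vocab)}
        D = rng.normal(scale=0.1, size=(len(docs), dim))   # document vectors
        W = rng.normal(scale=0.1, size=(len(vocab), dim))  # output word weights
        for _ in range(epochs):
            for d, words in enumerate(docs):
                for w in words:
                    scores = W @ D[d]
                    probs = np.exp(scores - scores.max())  # stable softmax
                    probs /= probs.sum()
                    grad = probs.copy()
                    grad[word_id[w]] -= 1.0                # gradient of -log p(w | doc) w.r.t. scores
                    grad_D = W.T @ grad                    # computed before W is updated
                    W -= lr * np.outer(grad, D[d])
                    D[d] -= lr * grad_D
        return D, W, vocab

    # Assumed toy corpus: two tiny "documents" with disjoint vocabularies.
    docs = [["good", "great", "fun"], ["bad", "boring", "dull"]]
    D, W, vocab = train_doc_vectors(docs)
    print(D.shape)  # one fixed-length vector per document
    ```

    Each row of `D` is a fixed-length representation of one document, whatever the document's length, which is the property the abstract contrasts with bag-of-words features.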
    5,611 Citations

    Supervised Paragraph Vector: Distributed Representations of Words, Documents and Class Labels
    • 4
    • Highly Influenced
    • PDF
    Spherical Paragraph Model
    • 1
    • Highly Influenced
    • PDF
    Learning Sentence Representation with Context
    • 2017
    • Highly Influenced
    • PDF
    Learning bag-of-embedded-words representations for textual information retrieval
    • 13
    • Highly Influenced
    • PDF
    Deep Learning for Document Representation
    • 1
    Top2Vec: Distributed Representations of Topics
    • 2
    • Highly Influenced
    • PDF
    Learning Sentence Vector Representations to Summarize
    • PDF
    Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content
    • 75
    • Highly Influenced
    • PDF
    Text Categorization by Weighted Features
    • Highly Influenced
    Regularized and Retrofitted models for Learning Sentence Representation with Context
    • 5
    • Highly Influenced
    • PDF

    References

    SHOWING 1-10 OF 48 REFERENCES
    Distributed Representations of Words and Phrases and their Compositionality
    • 20,383
    • PDF
    A Neural Probabilistic Language Model
    • 4,653
    • PDF
    Neural Probabilistic Language Models
    • 523
    • Highly Influential
    A Neural Autoregressive Topic Model
    • 177
    • PDF
    Efficient Estimation of Word Representations in Vector Space
    • 16,369
    • PDF
    Linguistic Regularities in Continuous Space Word Representations
    • 2,626
    • PDF
    Improving Word Representations via Global Context and Multiple Word Prototypes
    • 1,081
    • PDF
    Word Representations: A Simple and General Method for Semi-Supervised Learning
    • 1,997
    • Highly Influential
    • PDF
    Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
    • 4,101
    • Highly Influential
    • PDF