Enriching Word Vectors with Subword Information

@article{Bojanowski2017EnrichingWV,
  title={Enriching Word Vectors with Subword Information},
  author={Piotr Bojanowski and Edouard Grave and Armand Joulin and Tomas Mikolov},
  journal={Transactions of the Association for Computational Linguistics},
  year={2017},
  volume={5},
  pages={135--146}
}
Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. [...] A vector representation is associated with each character n-gram, and words are represented as the sum of these representations. Our method is fast, allowing models to be trained on large corpora quickly, and it lets us compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, both on word…
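The subword scheme described in the abstract can be sketched in a few lines. The n-gram vector table below is a toy stand-in (fastText itself hashes n-grams into a fixed number of buckets rather than storing each one explicitly), but the n-gram enumeration, including the `<` and `>` boundary markers and the whole word as an extra unit, follows the paper:

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Enumerate character n-grams of a word, with < and > marking word
    boundaries (so the 3-gram "<wh" differs from "whe" seen word-internally)."""
    w = f"<{word}>"
    grams = [w[i:i + n] for n in range(n_min, n_max + 1)
             for i in range(len(w) - n + 1)]
    grams.append(w)  # the full word (with boundaries) is kept as its own unit
    return grams

rng = np.random.default_rng(0)
dim = 4
table = {}  # hypothetical n-gram -> vector store, for illustration only

def vec(ngram):
    # Assign a random vector to each new n-gram; in training these
    # vectors would be learned with the skip-gram objective.
    if ngram not in table:
        table[ngram] = rng.normal(size=dim)
    return table[ngram]

def word_vector(word):
    """A word's vector is the sum of its character n-gram vectors,
    so even an out-of-vocabulary word still gets a representation."""
    return sum(vec(g) for g in char_ngrams(word))

print(char_ngrams("where", 3, 3))
# → ['<wh', 'whe', 'her', 'ere', 're>', '<where>']
```

Because related forms such as "where" and "wherever" share most of their n-grams, their vectors share most of their summands, which is what transfers information to rare and unseen words.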