Corpus ID: 11332377

Polyglot: Distributed Word Representations for Multilingual NLP

@article{AlRfou2013PolyglotDW,
  title={Polyglot: Distributed Word Representations for Multilingual NLP},
  author={Rami Al-Rfou and Bryan Perozzi and Steven Skiena},
  journal={ArXiv},
  year={2013},
  volume={abs/1307.1662}
}
  • Rami Al-Rfou, Bryan Perozzi, Steven Skiena
  • Published 2013
  • Computer Science
  • ArXiv
  • Distributed word representations (word embeddings) have recently contributed to competitive performance in language modeling and several NLP tasks. In this work, we train word embeddings for more than 100 languages using their corresponding Wikipedias. We quantitatively demonstrate the utility of our word embeddings by using them as the sole features for training a part of speech tagger for a subset of these languages. We find their performance to be competitive with near state-of-art methods…
    392 Citations
    Learning Word Vectors for 157 Languages
    • 481
    On Learning Word Embeddings From Linguistically Augmented Text Corpora
    • 2
    Tailoring Continuous Word Representations for Dependency Parsing
    • 272
    Mimicking Word Embeddings using Subword RNNs
    • 85
    A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
    • 9
    Finely Tuned, 2 Billion Token Based Word Embeddings for Portuguese
    • 2
    Generalizing Word Embeddings using Bag of Subwords
    • 22
    Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
    • Gábor Berend
    • Computer Science
    • Transactions of the Association for Computational Linguistics
    • 2017
    • 14
    Word Embeddings in Low Resource Gujarati Language
    • 2
    Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource
    • 24
