Deep contextualized word representations

@article{Peters2018DeepCW,
  title={Deep contextualized word representations},
  author={Matthew E. Peters and Mark Neumann and Mohit Iyyer and Matt Gardner and Christopher Clark and Kenton Lee and Luke Zettlemoyer},
  journal={ArXiv},
  year={2018},
  volume={abs/1802.05365}
}
  • We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). [...] We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
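
The point about "exposing the deep internals of the pre-trained network" refers to the paper's task-specific weighting of the biLM's layers (ELMo_k = γ · Σ_j s_j · h_{k,j}, with softmax-normalized weights s). A minimal sketch of that scalar mix; the function name and toy inputs are illustrative, not from the paper:

```python
import numpy as np

def scalar_mix(layer_activations, weights, gamma=1.0):
    """Collapse the biLM's per-layer hidden states for one token into a
    single task-specific vector via a softmax-weighted sum, as in ELMo."""
    s = np.exp(weights - np.max(weights))
    s /= s.sum()  # softmax over the per-layer weights
    return gamma * sum(sj * h for sj, h in zip(s, layer_activations))

# Hypothetical toy example: 3 biLM layers with 4-dimensional hidden states.
layers = [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)]
# Zero weights give a uniform softmax, so the mix is the layer mean.
mixed = scalar_mix(layers, weights=np.zeros(3))
```

Downstream models learn `weights` and `gamma` per task, which lets each task draw on whichever layers carry the most useful signal (e.g., lower layers for syntax, higher layers for semantics).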
  Citing papers include:
  • Retrofitting Contextualized Word Embeddings with Paraphrases
  • Deep contextualized word embeddings from character language models for neural sequence labeling
  • Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations
  • Dissecting Contextual Word Embeddings: Architecture and Representation
  • Contextualized Word Representations for Self-Attention Network
  • Contextual String Embeddings for Sequence Labeling
  • Linguistic Knowledge and Transferability of Contextual Representations
  • What Do You Learn From Context? Probing for Sentence Structure in Contextualized Word Representations
