Deep contextualized word representations

@article{Peters2018DeepCW,
  title={Deep contextualized word representations},
  author={Matthew E. Peters and Mark Neumann and Mohit Iyyer and Matt Gardner and Christopher Clark and Kenton Lee and Luke Zettlemoyer},
  journal={ArXiv},
  year={2018},
  volume={abs/1802.05365}
}
  • Abstract: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). [...] Key result: We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
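The "mixing" the abstract refers to is ELMo's task-specific weighted combination of the biLM's layer activations: a softmax-normalized set of scalar weights plus a global scale. A minimal NumPy sketch of that layer-mixing step (the function names and toy shapes here are illustrative, not taken from the paper or its released code):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array of scalars."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def scalar_mix(layer_reps, scalar_weights, gamma=1.0):
    """Collapse L biLM layers into one vector per token:
    a softmax-weighted sum of the layers, scaled by gamma.

    layer_reps:     array of shape (L, seq_len, dim)
    scalar_weights: L learnable scalars (here just given)
    """
    s = softmax(np.asarray(scalar_weights, dtype=float))
    # Contract the layer axis: (L,) x (L, seq_len, dim) -> (seq_len, dim)
    return gamma * np.tensordot(s, np.asarray(layer_reps), axes=1)

# Toy example: 3 layers, 4 tokens, 5-dim representations.
layers = np.random.randn(3, 4, 5)
elmo = scalar_mix(layers, scalar_weights=[0.0, 0.0, 0.0])
print(elmo.shape)  # (4, 5)
```

With equal weights the mix reduces to the plain layer average; in the downstream task the weights and gamma would instead be learned jointly with the task model.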
    Citations

    • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (9,786 citations; Open Access)
    • Contextual String Embeddings for Sequence Labeling (375 citations; Highly Influenced; Open Access)
    • Dissecting Contextual Word Embeddings: Architecture and Representation (149 citations; Open Access)
    • Linguistic Knowledge and Transferability of Contextual Representations (171 citations; Open Access)
    • What do you learn from context? Probing for sentence structure in contextualized word representations (203 citations; Highly Influenced; Open Access)
    • Language Models are Unsupervised Multitask Learners (1,724 citations; Open Access)
    • Improving Language Understanding by Generative Pre-Training (1,400 citations; Highly Influenced; Open Access)
    • Pooled Contextualized Embeddings for Named Entity Recognition (79 citations; Open Access)
    • Universal Language Model Fine-tuning for Text Classification (1,098 citations; Highly Influenced; Open Access)

    References

    Publications referenced by this paper (showing 1-10 of 65 references).
    • Glove: Global Vectors for Word Representation (14,475 citations; Highly Influential; Open Access)
    • Learned in Translation: Contextualized Word Vectors (490 citations; Highly Influential; Open Access)
    • Word Representations: A Simple and General Method for Semi-Supervised Learning (1,946 citations; Open Access)
    • Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (3,790 citations; Open Access)
    • Enriching Word Vectors with Subword Information (3,603 citations; Open Access)
    • Distributed Representations of Words and Phrases and their Compositionality (18,700 citations; Open Access)
    • Semi-supervised sequence tagging with bidirectional language models (316 citations; Open Access)
    • Character-Aware Neural Language Models (1,170 citations; Open Access)
    • Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation (490 citations; Open Access)
    • Natural Language Processing (Almost) from Scratch (5,429 citations; Open Access)