Deep contextualized word representations

@article{Peters2018DeepCW,
  title={Deep contextualized word representations},
  author={Matthew E. Peters and Mark Neumann and Mohit Iyyer and Matt Gardner and Christopher Clark and Kenton Lee and Luke Zettlemoyer},
  journal={ArXiv},
  year={2018},
  volume={abs/1802.05365}
}
Abstract: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). [...] Key result: we also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
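The mechanism behind "exposing the deep internals" is ELMo's scalar mix: the downstream task learns a softmax-normalized weighted combination of all biLM layer activations, scaled by a task-specific scalar. Below is a minimal NumPy sketch of that layer mixing; the function name, array shapes, and random inputs are illustrative assumptions, and in practice the activations come from the pre-trained bidirectional language model rather than random noise.

    import numpy as np

    def elmo_scalar_mix(layer_activations, layer_logits, gamma):
        # layer_activations: (L, seq_len, dim) hidden states from the L
        # exposed biLM layers (faked with random data below; hypothetical
        # shapes chosen only for the example).
        # layer_logits: length-L vector of learned, task-specific weights.
        # gamma: learned task-specific scaling scalar.
        s = np.exp(layer_logits - layer_logits.max())
        s = s / s.sum()  # softmax over the layer axis
        # gamma-scaled weighted sum over layers -> one vector per token
        return gamma * np.einsum("l,lsd->sd", s, layer_activations)

    # Toy usage: 3 biLM layers, 5 tokens, 8-dim states (illustrative sizes).
    acts = np.random.randn(3, 5, 8)
    logits = np.zeros(3)  # softmax gives every layer equal weight at init
    vectors = elmo_scalar_mix(acts, logits, gamma=1.0)
    print(vectors.shape)  # (5, 8): one mixed representation per token

Learning the mixing weights per task is what lets downstream models blend different semi-supervision signals: the paper's analysis finds that lower biLM layers capture mostly syntactic information while higher layers capture context-dependent semantics, and the learned softmax weights let each task choose its own blend.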

    Citations

    Papers citing this work (citation counts in parentheses where available):
    • Retrofitting Contextualized Word Embeddings with Paraphrases (5)
    • Dissecting Contextual Word Embeddings: Architecture and Representation (135)
    • Deep contextualized word embeddings from character language models for neural sequence labeling (1)
    • Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations (8)
    • Contextualized Word Representations for Self-Attention Network
    • Contextual String Embeddings for Sequence Labeling (353)
    • Quantifying the Contextualization of Word Representations with Semantic Class Probing
    • Linguistic Knowledge and Transferability of Contextual Representations (147)

    References

    Publications referenced by this paper (showing 1-10 of 65 references; citation counts in parentheses):
    • Learned in Translation: Contextualized Word Vectors (479)
    • Semi-supervised sequence tagging with bidirectional language models (308)
    • context2vec: Learning Generic Context Embedding with Bidirectional LSTM (246)
    • Word Representations: A Simple and General Method for Semi-Supervised Learning (1940)
    • Embeddings for Word Sense Disambiguation: An Evaluation Study (177)
    • Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation (486)
    • Character-Aware Neural Language Models (1151)
    • Neural Sequence Learning Models for Word Sense Disambiguation (98)
    • Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (3726)
    • Enriching Word Vectors with Subword Information (3511)