Deep contextualized word representations

@inproceedings{Peters2018DeepCW,
  title={Deep contextualized word representations},
  author={Matthew E. Peters and Mark Neumann and Mohit Iyyer and Matt Gardner and Christopher Clark and Kenton Lee and Luke Zettlemoyer},
  booktitle={NAACL-HLT},
  year={2018}
}
We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). [...] We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
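The layer mixing described in the abstract is the paper's central mechanism: ELMo collapses the biLM's layer activations into one task-specific representation via a softmax-normalized weighted sum, scaled by a learned factor γ. A minimal NumPy sketch of that idea, with random arrays standing in for real biLM activations and all sizes chosen for illustration:

```python
import numpy as np

# Sketch of ELMo-style layer mixing: a task learns scalar weights s_j
# (softmax-normalized) and a scale gamma, then collapses the biLM's
# layers into one representation per token. The activations below are
# random stand-ins, not outputs of an actual pre-trained biLM.

rng = np.random.default_rng(0)
num_layers, seq_len, dim = 3, 5, 8            # hypothetical sizes
layer_activations = rng.normal(size=(num_layers, seq_len, dim))

def elmo_mix(activations, s_logits, gamma):
    """Weighted sum over layers: gamma * sum_j softmax(s)_j * h_j."""
    s = np.exp(s_logits - s_logits.max())
    s /= s.sum()                              # softmax-normalized weights
    return gamma * np.einsum("j,jtd->td", s, activations)

s_logits = np.zeros(num_layers)               # learned by the downstream task
gamma = 1.0                                   # learned scaling factor
rep = elmo_mix(layer_activations, s_logits, gamma)
print(rep.shape)                              # (5, 8): one vector per token
```

With uniform weights (zero logits) the mix reduces to a plain average of the layers; training the logits lets each downstream task emphasize whichever layers carry the most useful signal.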
5,437 Citations
Deep contextualized word embeddings from character language models for neural sequence labeling (1 citation; highly influenced)
Dissecting Contextual Word Embeddings: Architecture and Representation (187 citations)
Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations (22 citations; highly influenced)
Retrofitting Contextualized Word Embeddings with Paraphrases (9 citations; highly influenced)
Quantifying the Contextualization of Word Representations with Semantic Class Probing (2 citations; highly influenced)
Contextual String Embeddings for Sequence Labeling (570 citations; highly influenced)
Contextualized Word Representations for Self-Attention Network
Context Analysis for Pre-trained Masked Language Models (highly influenced)
Linguistic Knowledge and Transferability of Contextual Representations (254 citations)

References

Showing 1-10 of 65 references

Learned in Translation: Contextualized Word Vectors (565 citations; highly influential)
Semi-supervised sequence tagging with bidirectional language models (377 citations)
context2vec: Learning Generic Context Embedding with Bidirectional LSTM (298 citations; highly influential)
Word Representations: A Simple and General Method for Semi-Supervised Learning (2,026 citations)
Embeddings for Word Sense Disambiguation: An Evaluation Study (198 citations)
Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation (526 citations)
Character-Aware Neural Language Models (1,287 citations)
Neural Sequence Learning Models for Word Sense Disambiguation (119 citations)
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (4,235 citations)
Enriching Word Vectors with Subword Information (4,607 citations)