Linguistic Knowledge and Transferability of Contextual Representations

@inproceedings{Liu2019LinguisticKA,
  title={Linguistic Knowledge and Transferability of Contextual Representations},
  author={Nelson F. Liu and Matt Gardner and Yonatan Belinkov and Matthew E. Peters and Noah A. Smith},
  booktitle={NAACL-HLT},
  year={2019}
}
Contextual word representations derived from large-scale neural language models are successful across a diverse set of NLP tasks, suggesting that they encode useful and transferable features of language. [...] However, language model pretraining on more data gives the best results.
273 Citations

- On the Hierarchical Information in a Single Contextualised Word Representation (Student Abstract)
- Quantifying the Contextualization of Word Representations with Semantic Class Probing
- Context Analysis for Pre-trained Masked Language Models
- On the Linguistic Representational Power of Neural Machine Translation Models
- Probing Pretrained Language Models for Lexical Semantics
- Multilingual Probing of Deep Pre-Trained Contextual Encoders
- Contextual and Non-Contextual Word Embeddings: an in-depth Linguistic Investigation
- Learning Better Sentence Representation with Syntax Information
- How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings
