Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

@article{Gururangan2020DontSP,
  title={Don't Stop Pretraining: Adapt Language Models to Domains and Tasks},
  author={Suchin Gururangan and Ana Marasovi{\'c} and Swabha Swayamdipta and Kyle Lo and Iz Beltagy and Doug Downey and Noah A. Smith},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.10964}
}
Language models pretrained on text from a wide variety of sources form the foundation of today's NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target task. We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to…
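For concreteness, the sketch below shows what a second phase of in-domain masked-LM pretraining (domain-adaptive pretraining) could look like in practice. It uses the Hugging Face Transformers and Datasets libraries as an illustration rather than the authors' original training setup; the corpus path, hyperparameters, and output directory are placeholders, not values from the paper.

# Minimal sketch of domain-adaptive pretraining (DAPT): continue masked-LM
# training of RoBERTa on an unlabeled in-domain corpus before task fine-tuning.
# Hugging Face Transformers/Datasets are used for illustration; the file
# "domain_corpus.txt" and all hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Unlabeled in-domain text, one document per line (placeholder path).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard 15% dynamic masking for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="roberta-dapt",          # placeholder output directory
    per_device_train_batch_size=8,      # illustrative; scale to your hardware
    num_train_epochs=1,                 # illustrative; not the paper's schedule
    learning_rate=5e-4,
    save_steps=10_000,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()

The resulting checkpoint in roberta-dapt would then be fine-tuned on the labeled target task as usual; task-adaptive pretraining follows the same recipe with the task's own unlabeled text as the corpus.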
182 Citations
  • An Empirical Investigation Towards Efficient Multi-Domain Language Model Pre-training
  • Domain and Task Adaptive Pretraining for Language Models (Highly Influenced)
  • Task-specific Objectives of Pre-trained Language Models for Dialogue Adaptation (1 citation)
  • Pretrained Language Models for Biomedical and Clinical Tasks: Understanding and Extending the State-of-the-Art (Highly Influenced)
  • Predictions For Pre-training Language Models
  • Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training (1 citation, Highly Influenced)
  • Feature Adaptation of Pre-Trained Language Models across Languages and Domains for Text Classification (1 citation, Highly Influenced)
  • Multi-Stage Pre-training for Low-Resource Domain Adaptation
  • Go Simple and Pre-Train on Domain-Specific Corpora: On the Role of Training Data for Text Classification (1 citation, Highly Influenced)
  • Train No Evil: Selective Masking for Task-guided Pre-training (5 citations)

References (showing 1-10 of 75)
  • Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling (37 citations)
  • Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (1,024 citations)
  • SciBERT: A Pretrained Language Model for Scientific Text (376 citations)
  • Improving Language Understanding by Generative Pre-Training (1,929 citations)
  • GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (1,265 citations)
  • To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks (163 citations)
  • What to do about non-standard (or non-canonical) language in NLP (34 citations)
  • An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models (46 citations)