Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
@article{Gururangan2020DontSP,
  title={Don't Stop Pretraining: Adapt Language Models to Domains and Tasks},
  author={Suchin Gururangan and Ana Marasovi{\'c} and Swabha Swayamdipta and Kyle Lo and Iz Beltagy and Doug Downey and Noah A. Smith},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.10964}
}
Language models pretrained on text from a wide variety of sources form the foundation of today's NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target task. We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, under both high- and low-resource settings. Moreover, adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining.
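Both adaptation phases amount to continuing masked-LM pretraining on new unlabeled text before fine-tuning on the target task. Below is a minimal sketch of domain-adaptive pretraining assuming the Hugging Face Transformers and Datasets libraries (the paper's own experiments use RoBERTa via fairseq); the corpus path, output directory, and hyperparameters are placeholders, not the paper's settings.

```python
# Hypothetical sketch of domain-adaptive pretraining (DAPT): continue
# masked-LM pretraining of RoBERTa on an unlabeled in-domain corpus.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Unlabeled in-domain corpus, one document per line (placeholder path).
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens: the standard masked-LM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="roberta-dapt",       # placeholder checkpoint directory
    per_device_train_batch_size=8,
    num_train_epochs=1,              # placeholder schedule, not the paper's
    learning_rate=5e-5,
    save_steps=1000,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
# The adapted checkpoint in "roberta-dapt" is then fine-tuned on the target
# classification task; task-adaptive pretraining (TAPT) runs the same loop
# on the task's own unlabeled text before fine-tuning.
```

Task-adaptive pretraining follows the same recipe on the (typically much smaller) unlabeled text of the target task itself, so the two phases can be chained: domain corpus first, then task corpus, then supervised fine-tuning.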
182 Citations
- An Empirical Investigation Towards Efficient Multi-Domain Language Model Pre-training. EMNLP 2020.
- Task-specific Objectives of Pre-trained Language Models for Dialogue Adaptation. ArXiv 2020. Cited by 1.
- Pretrained Language Models for Biomedical and Clinical Tasks: Understanding and Extending the State-of-the-Art. ClinicalNLP@EMNLP 2020. Highly Influenced.
- Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training. EMNLP 2020. Cited by 1. Highly Influenced.
- Feature Adaptation of Pre-Trained Language Models across Languages and Domains for Text Classification. ArXiv 2020. Cited by 1. Highly Influenced.
- Go Simple and Pre-Train on Domain-Specific Corpora: On the Role of Training Data for Text Classification. COLING 2020. Cited by 1. Highly Influenced.