Corpus ID: 218487640

Intermediate-Task Transfer Learning with Pretrained Models for Natural Language Understanding: When and Why Does It Work?

@article{Pruksachatkun2020IntermediateTaskTL,
  title={Intermediate-Task Transfer Learning with Pretrained Models for Natural Language Understanding: When and Why Does It Work?},
  author={Yada Pruksachatkun and Jason Phang and Haokun Liu and Phu Mon Htut and Xiaoyi Zhang and Richard Yuanzhe Pang and C. Vania and K. Kann and Samuel R. Bowman},
  journal={ArXiv},
  year={2020},
  volume={abs/2005.00628}
}
While pretrained models such as BERT have shown large gains across natural language understanding tasks, their performance can be improved by further training the model on a data-rich intermediate task before fine-tuning it on a target task. However, it is still poorly understood when and why intermediate-task training is beneficial for a given target task. To investigate this, we perform a large-scale study on the pretrained RoBERTa model with 110 intermediate-target task combinations. We…
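
The two-phase recipe described in the abstract (training on a data-rich intermediate task, then fine-tuning on the target task) can be sketched with off-the-shelf tooling. The snippet below is a minimal illustration using Hugging Face Transformers and Datasets rather than the authors' own pipeline; the choice of MNLI as the intermediate task, RTE as the target task, and all hyperparameters are illustrative assumptions, not values from the paper.

    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    MODEL = "roberta-large"  # the paper studies RoBERTa; any encoder checkpoint works here
    tokenizer = AutoTokenizer.from_pretrained(MODEL)

    def tokenize_pairs(dataset, col_a, col_b):
        # Tokenize a sentence-pair classification dataset.
        return dataset.map(
            lambda ex: tokenizer(ex[col_a], ex[col_b], truncation=True, max_length=128),
            batched=True)

    def fine_tune(model, train_dataset, output_dir):
        # One standard fine-tuning phase; the resulting checkpoint is written to output_dir.
        args = TrainingArguments(output_dir=output_dir, num_train_epochs=3,
                                 per_device_train_batch_size=16)
        trainer = Trainer(model=model, args=args,
                          train_dataset=train_dataset, tokenizer=tokenizer)
        trainer.train()
        trainer.save_model(output_dir)

    # Phase 1: intermediate-task training on a data-rich task (MNLI, as an example).
    mnli = tokenize_pairs(load_dataset("glue", "mnli", split="train"),
                          "premise", "hypothesis")
    model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)
    fine_tune(model, mnli, "ckpt/intermediate")

    # Phase 2: fine-tune the intermediate checkpoint on the target task (RTE, as an example).
    # A fresh classification head is created because the target label set differs.
    rte = tokenize_pairs(load_dataset("glue", "rte", split="train"),
                         "sentence1", "sentence2")
    model = AutoModelForSequenceClassification.from_pretrained(
        "ckpt/intermediate", num_labels=2, ignore_mismatched_sizes=True)
    fine_tune(model, rte, "ckpt/target")

In this sketch only the encoder weights carry over between the two phases; the classification head is re-initialized for the target task, which is the usual setup for this kind of transfer.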

    Publications citing this paper.

    English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too
    AdapterHub: A Framework for Adapting Transformers
    Intermediate Training of BERT for Product Matching
    On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
