Corpus ID: 218971783

Language Models are Few-Shot Learners

@article{Brown2020LanguageMA,
  title={Language Models are Few-Shot Learners},
  author={T. Brown and Benjamin Mann and Nick Ryder and Melanie Subbiah and J. Kaplan and Prafulla Dhariwal and Arvind Neelakantan and Pranav Shyam and Girish Sastry and Amanda Askell and Sandhini Agarwal and Ariel Herbert-Voss and Gretchen Krueger and T. Henighan and R. Child and A. Ramesh and Daniel M. Ziegler and Jeff Wu and Clemens Winter and Christopher Hesse and Mark Chen and Eric Sigler and Mateusz Litwin and Scott Gray and Benjamin Chess and J. Clark and Christopher Berner and Sam McCandlish and Alec Radford and Ilya Sutskever and Dario Amodei},
  journal={ArXiv},
  year={2020},
  volume={abs/2005.14165}
}
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle…
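The abstract contrasts fine-tuning, which needs thousands of labeled examples, with few-shot "in-context" learning, where the task is specified entirely through the prompt at inference time. As an illustration only (not code from the paper), the Python sketch below shows one way such a few-shot prompt can be assembled; the function name build_few_shot_prompt and the Input/Output formatting are assumptions made for this example, and the English-to-French pairs loosely follow the translation demonstrations used as a running example in the paper.

```python
# Minimal sketch (not from the paper) of few-shot, in-context prompting:
# a task description plus k demonstration pairs are concatenated into a
# single text prompt, and the language model is asked to continue it.
# No gradient updates or fine-tuning are involved.

def build_few_shot_prompt(task_description, examples, query):
    """Concatenate a task description, k labeled examples, and a new query
    into one prompt string for a text-completion language model."""
    lines = [task_description, ""]
    for source, target in examples:      # the k "shots"
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")              # the model completes this line
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        task_description="Translate English to French.",
        examples=[("sea otter", "loutre de mer"), ("cheese", "fromage")],
        query="peppermint",
    )
    print(prompt)  # this text would be sent to a pretrained model for completion
```

A real system would send the resulting prompt to a pretrained language model and read the continuation as the answer; no parameters are updated, which is the point of the few-shot setting described in the abstract.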
Citations

Exploring Large Language Models in a Limited Resource Scenario
When Do You Need Billions of Words of Pretraining Data?
CPM: A Large-scale Generative Chinese Pre-trained Language Model
Learning from Task Descriptions
…
