Language Models are Few-Shot Learners

@article{brown2020language,
  title={Language Models are Few-Shot Learners},
  author={T. Brown and B. Mann and Nick Ryder and Melanie Subbiah and J. Kaplan and P. Dhariwal and Arvind Neelakantan and Pranav Shyam and Girish Sastry and Amanda Askell and Sandhini Agarwal and Ariel Herbert-Voss and G. Krueger and Tom Henighan and R. Child and Aditya Ramesh and D. Ziegler and Jeffrey Wu and Clemens Winter and Christopher Hesse and Mark Chen and E. Sigler and Mateusz Litwin and Scott Gray and Benjamin Chess and J. Clark and Christopher Berner and Sam McCandlish and A. Radford and Ilya Sutskever and Dario Amodei},
  journal={ArXiv},
  year={2020}
}
  • T. Brown, B. Mann, +28 authors Dario Amodei
  • Published 2020
  • Computer Science
  • ArXiv
  • Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle…
    Artificial Neural Networks Accurately Predict Language Processing in the Brain
    Measuring Massive Multitask Language Understanding
    Word meaning in minds and machines
    A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks
    Generative Language Modeling for Automated Theorem Proving
    Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems
    Natural Backdoor Attack on Text Data
    Data Movement Is All You Need: A Case Study of Transformer Networks
    Data Movement Is All You Need: A Case Study on Optimizing Transformers