Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm

@article{Reynolds2021PromptPF,
  title={Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm},
  author={Laria Reynolds and Kyle McDonell},
  journal={Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems},
  year={2021}
}
  • Laria Reynolds, Kyle McDonell
  • Published 15 February 2021
  • Computer Science
  • Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems
Prevailing methods for mapping large generative language models to supervised tasks may fail to sufficiently probe models’ novel capabilities. Using GPT-3 as a case study, we show that 0-shot prompts can significantly outperform few-shot prompts. We suggest that the function of few-shot examples in these cases is better described as locating an already learned task rather than meta-learning. This analysis motivates rethinking the role of prompts in controlling and evaluating powerful language… 
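To make the contrast concrete, here is a minimal Python sketch of the two prompt styles; the prompt wordings are my own illustration in the spirit of the paper's translation example, not prompts quoted from it.

```python
# Illustrative prompt formats only; wordings are mine, not the paper's.

# Few-shot: the task is conveyed through in-context example pairs.
FEW_SHOT = """Translate English to French:
sea otter => loutre de mer
cheese => fromage
plush giraffe =>"""

# Zero-shot: no examples, just a frame that makes the intended task unambiguous.
# The paper argues such prompts work because they "locate" a task the model has
# already learned, rather than teaching it at inference time.
ZERO_SHOT = """Translate English to French:
plush giraffe =>"""

for name, prompt in (("few-shot", FEW_SHOT), ("zero-shot", ZERO_SHOT)):
    print(f"--- {name} ---\n{prompt}\n")
```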

Citations

Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting
TLDR
Prompts from a diverse range of tasks are collected and standardized for use with tasks they were not designed for, then evaluated across multiple-choice datasets to quantitatively analyze how certain attributes of a prompt affect performance.
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
TLDR
This paper proposes a novel data augmentation technique that leverages large-scale language models to generate realistic text samples from a mixture of real samples, and utilizes soft-labels predicted by the language models, effectively distilling knowledge from the large-scale language models and creating textual perturbations simultaneously.
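As a rough illustration of the recipe this entry describes, the sketch below constructs a GPT3Mix-style mixing prompt from two real labeled examples; the template wording and the gpt3mix_prompt helper are my own, and in practice the soft label would be read from the model's probabilities over the label words rather than hard-coded.

```python
# A minimal sketch, not the paper's exact template: two real labeled examples
# are embedded in a prompt that asks the model to produce a new, "mixed"
# example; the soft label comes from the model's label-token probabilities.

def gpt3mix_prompt(ex_a, ex_b, task="movie review", labels=("positive", "negative")):
    header = f"Each item is a {task} with a sentiment ({' or '.join(labels)}).\n\n"
    body = (
        f"Review: {ex_a['text']} (Sentiment: {ex_a['label']})\n"
        f"Review: {ex_b['text']} (Sentiment: {ex_b['label']})\n"
        f"Review:"  # the model continues here with a synthetic, blended example
    )
    return header + body

print(gpt3mix_prompt(
    {"text": "A tense, beautifully shot thriller.", "label": "positive"},
    {"text": "Flat characters and a predictable plot.", "label": "negative"},
))
```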
Flamingo: a Visual Language Model for Few-Shot Learning
TLDR
It is demonstrated that a single Flamingo model can achieve a new state of the art for few-shot learning, simply by prompting the model with task-specific examples.
PaLM: Scaling Language Modeling with Pathways
TLDR
A 540-billion-parameter, densely activated Transformer language model called PaLM achieves breakthrough performance, outperforming the state of the art on a suite of multi-step reasoning tasks and exceeding average human performance on the recently released BIG-bench benchmark.
Continuous Entailment Patterns for Lexical Inference in Context
TLDR
In a direct comparison with discrete patterns, CONAN consistently leads to improved performance, setting a new state of the art in lexical inference in context and raising important questions regarding the understanding of PLMs using text patterns.
GPT-3 for Few-Shot Dialogue State Tracking
TLDR
It is found that natural language instructions in the prompt have little impact on performance, larger language models do not always induce higher downstream performance and that GPT-3 is highly sensitive to the order and number of the in-context examples.
Wordcraft: Story Writing With Large Language Models
TLDR
This work built Wordcraft, a text editor in which users collaborate with a generative language model to write a story, and shows that large language models enable novel co-writing experiences.
Inspecting the concept knowledge graph encoded by modern language models
TLDR
This work extracts the underlying knowledge graph of nine of the most influential language models of recent years, including word embeddings, text generators, and context encoders, and shows that all the models encode this knowledge, but suffer from several inaccuracies.
Few-Shot Bot: Prompt-Based Learning for Dialogue Systems
TLDR
An end-to-end chatbot named the Few-Shot Bot is created, which automatically selects the most appropriate conversational skill, queries different knowledge bases or the internet, and uses the retrieved knowledge to generate a human-like response, all using only few dialogue examples per skill.
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
TLDR
Discusses the possibility of materializing the No Code AI paradigm by providing AI prototyping capabilities to non-experts of ML through HyperCLOVA Studio, an interactive prompt engineering interface, and shows the performance benefits of prompt-based learning and how it can be integrated into the prompt engineering pipeline.

References

SHOWING 1-10 OF 36 REFERENCES
CTRL: A Conditional Transformer Language Model for Controllable Generation
TLDR
CTRL is released, a 1.63 billion-parameter conditional transformer language model, trained to condition on control codes that govern style, content, and task-specific behavior, providing more explicit control over text generation.
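At the prompt level, this kind of conditioning amounts to prepending a control code to the context, as in the small sketch below; the specific code strings are illustrative examples of mine, not a definitive list of CTRL's codes.

```python
# Control-code conditioning, sketched: generation is steered by whichever code
# begins the context. Treat the specific codes below as illustrative only.

def ctrl_prompt(control_code: str, text: str) -> str:
    return f"{control_code} {text}"

print(ctrl_prompt("Reviews", "This laptop"))       # nudge toward review-style text
print(ctrl_prompt("Wikipedia", "The history of"))  # nudge toward encyclopedic text
```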
Prefix-Tuning: Optimizing Continuous Prompts for Generation
TLDR
Prefix-tuning is proposed, a lightweight alternative to fine-tuning for natural language generation tasks, which keeps language model parameters frozen and instead optimizes a sequence of continuous task-specific vectors, called the prefix.
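The sketch below captures the core mechanics in PyTorch under simplifications of mine (module names, a generic stand-in encoder): the base model and embeddings stay frozen, and only a short sequence of continuous prefix vectors is trained and prepended to the input embeddings.

```python
# A minimal prefix-tuning sketch; a toy frozen Transformer encoder stands in
# for the pretrained language model.
import torch
import torch.nn as nn

class PrefixTuned(nn.Module):
    def __init__(self, base_model: nn.Module, embed: nn.Embedding, prefix_len: int = 10):
        super().__init__()
        self.base_model, self.embed = base_model, embed
        # The only trainable parameters: prefix_len continuous vectors.
        self.prefix = nn.Parameter(torch.randn(prefix_len, embed.embedding_dim) * 0.02)
        for p in self.base_model.parameters():  # freeze the pretrained weights
            p.requires_grad_(False)
        for p in self.embed.parameters():
            p.requires_grad_(False)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.embed(input_ids)                                  # (B, T, d)
        pre = self.prefix.unsqueeze(0).expand(tok.size(0), -1, -1)   # (B, P, d)
        return self.base_model(torch.cat([pre, tok], dim=1))         # (B, P+T, d)

embed = nn.Embedding(1000, 64)
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(64, 4, batch_first=True), 2)
model = PrefixTuned(encoder, embed, prefix_len=5)
out = model(torch.randint(0, 1000, (2, 12)))
print(out.shape)                                                     # (2, 17, 64)
print([n for n, p in model.named_parameters() if p.requires_grad])   # ['prefix']
```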
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
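As a sketch of the "one additional output layer" recipe mentioned above (a simplification of mine, with a generic encoder standing in for BERT), fine-tuning for classification just adds a linear head over the first-token representation:

```python
# A minimal classification head over an encoder's first-token representation;
# the encoder argument is a stand-in for a pretrained bidirectional model.
import torch.nn as nn

class ClassifierHead(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_dim: int, num_labels: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(hidden_dim, num_labels)  # the only new parameters

    def forward(self, embedded_input):
        h = self.encoder(embedded_input)   # (batch, seq_len, hidden_dim)
        return self.head(h[:, 0])          # classify from the [CLS]-style first token
```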
Hierarchical Neural Story Generation
TLDR
This work collects a large dataset of 300K human-written stories paired with writing prompts from an online forum that enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text.
Low-Resource Generation of Multi-hop Reasoning Questions
TLDR
This paper first builds a multi-hop question generation model and guides it to satisfy logical rationality using the reasoning chain extracted from a given text, then applies it to the task of machine reading comprehension and achieves significant performance improvements.
The Curious Case of Neural Text Degeneration
TLDR
By sampling text from the dynamic nucleus of the probability distribution, which allows for diversity while effectively truncating the less reliable tail of the distribution, the resulting text better matches the quality of human text, yielding enhanced diversity without sacrificing fluency and coherence.
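The decoding rule itself is short; below is a NumPy-only toy implementation of nucleus (top-p) sampling along the lines the entry describes, where p = 0.9 and the toy probability vector are arbitrary choices of mine.

```python
# Nucleus (top-p) sampling: keep the smallest set of tokens whose cumulative
# probability reaches p, renormalize, and sample from that "nucleus".
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]              # token ids, most to least probable
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1    # smallest prefix with mass >= p
    nucleus = order[:cutoff]
    renormalized = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=renormalized))

vocab_probs = np.array([0.42, 0.28, 0.17, 0.08, 0.03, 0.02])
print(nucleus_sample(vocab_probs, p=0.9))        # samples only from the head tokens
```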
Universal Language Model Fine-tuning for Text Classification
TLDR
This work proposes Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduces techniques that are key for fine-tuning a language model.
Zero-shot Learning by Generating Task-specific Adapters
TLDR
HYPTER is introduced, a framework that improves zero-shot transferability by training a hypernetwork to generate task-specific adapters from task descriptions, and greatly reduces the number of parameters by using lightweight adapters.
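A rough sketch of the hypernetwork-to-adapter idea follows, under simplifications of mine (shapes, module names, a single bottleneck adapter rather than one per layer): a task-description embedding is mapped by a hypernetwork to the weights of a small adapter, which is then applied to frozen hidden states.

```python
# Hypernetwork -> adapter sketch: the task embedding parameterizes a small
# bottleneck adapter (down-projection, ReLU, up-projection, residual).
import torch
import torch.nn as nn

class AdapterHypernet(nn.Module):
    def __init__(self, task_dim: int, d_model: int, bottleneck: int = 16):
        super().__init__()
        self.d_model, self.bottleneck = d_model, bottleneck
        # Maps a task-description embedding to flattened adapter weights.
        self.hyper = nn.Linear(task_dim, 2 * d_model * bottleneck)

    def forward(self, task_emb: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        w = self.hyper(task_emb)
        split = self.d_model * self.bottleneck
        w_down = w[:split].view(self.bottleneck, self.d_model)   # (b, d)
        w_up = w[split:].view(self.d_model, self.bottleneck)     # (d, b)
        return hidden + torch.relu(hidden @ w_down.T) @ w_up.T   # residual adapter

adapter = AdapterHypernet(task_dim=32, d_model=64)
out = adapter(torch.randn(32), torch.randn(2, 10, 64))  # task embedding, hidden states
print(out.shape)                                        # torch.Size([2, 10, 64])
```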
TextWorld: A Learning Environment for Text-based Games
TLDR
TextWorld is a Python library that handles interactive play-through of text games, as well as backend functions like state tracking and reward assignment, and comes with a curated list of games whose features and challenges the authors have analyzed.
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
TLDR
This paper presents a new model for visual dialog, Recurrent Dual Attention Network (ReDAN), using multi-step reasoning to answer a series of questions about an image, and demonstrates that ReDAN can locate context-relevant visual and textual clues via iterative refinement, which can lead to the correct answer step-by-step.