Corpus ID: 236493269

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

@article{Liu2021PretrainPA,
  title={Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing},
  author={Pengfei Liu and Weizhe Yuan and Jinlan Fu and Zhengbao Jiang and Hiroaki Hayashi and Graham Neubig},
  journal={ArXiv},
  year={2021},
  volume={abs/2107.13586}
}
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub “prompt-based learning”. Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly. To use these models to perform prediction tasks, the original input x is modified using a template into a textual string prompt x′ that has some…
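To make the paradigm concrete, the following is a minimal sketch of cloze-style prompt-based prediction in Python, assuming the Hugging Face transformers library and the bert-base-uncased masked language model; the template and verbalizer words are illustrative choices, not taken from the survey itself.

# Minimal sketch of cloze-style prompt-based prediction (illustrative only).
# Assumes the Hugging Face transformers library and a BERT-style masked LM.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def classify_sentiment(text: str) -> str:
    # Wrap the original input x in a template to obtain the prompt x'.
    prompt = f"{text} Overall, it was a [MASK] movie."
    # Let the masked LM fill the slot, then map answer words back to labels.
    verbalizer = {"great": "positive", "good": "positive",
                  "terrible": "negative", "bad": "negative"}
    for candidate in fill_mask(prompt, top_k=20):
        word = candidate["token_str"].strip()
        if word in verbalizer:
            return verbalizer[word]
    return "unknown"

print(classify_sentiment("I could not stop laughing from start to finish."))

Changing the template, the answer space, or the underlying language model changes the task without retraining the model, which is the common thread among the works surveyed below.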
NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task-Next Sentence Prediction
TLDR
This paper attempts to accomplish several NLP tasks in the zero-shot scenario using an original BERT pre-training task that was abandoned by RoBERTa and other models: Next Sentence Prediction (NSP).
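For intuition about the approach above, here is a rough sketch of how the NSP head can act as a zero-shot classifier by scoring how plausibly a label description follows the input; it uses BertForNextSentencePrediction from Hugging Face transformers, and the label descriptions are illustrative assumptions rather than the paper's exact formulation.

# Rough sketch of zero-shot classification via Next Sentence Prediction,
# in the spirit of NSP-BERT but not its exact formulation.
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
model.eval()

def nsp_zero_shot(text, label_prompts):
    scores = {}
    for label, prompt in label_prompts.items():
        # Score how plausibly the label description follows the input text.
        inputs = tokenizer(text, prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        scores[label] = logits[0, 0].item()  # index 0 = "is a continuation"
    return max(scores, key=scores.get)

# Illustrative label descriptions (a design choice, not from the paper).
labels = {"positive": "This review expresses a positive opinion.",
          "negative": "This review expresses a negative opinion."}
print(nsp_zero_shot("The plot was dull and the acting was worse.", labels))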
Language Models in the Loop: Incorporating Prompting into Weak Supervision
TLDR
The experimental evaluation shows that prompting large language models within a weak supervision framework can provide gains in accuracy, and that this approach produces classifiers with comparable or superior accuracy to those trained from hand-engineered rules.
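As a sketch of the idea above, prompted model predictions can be treated as labeling functions and then aggregated, as in standard weak supervision; the snippet below uses a simple majority vote and a hypothetical query_lm callable in place of a real language-model API, and it is not the paper's exact pipeline (which fits a label model over the outputs).

# Sketch: prompted LM predictions used as labeling functions in a
# weak-supervision setup, aggregated here by simple majority vote.
# query_lm is a hypothetical callable standing in for an actual LM API.
from collections import Counter

ABSTAIN = None

def make_prompt_lf(template, label_map, query_lm):
    # Each (template, verbalizer) pair becomes one labeling function.
    def lf(example):
        answer = query_lm(template.format(text=example))  # hypothetical LM call
        return label_map.get(answer.strip().lower(), ABSTAIN)
    return lf

def majority_vote(labeling_functions, example):
    votes = [lf(example) for lf in labeling_functions]
    votes = [v for v in votes if v is not ABSTAIN]
    return Counter(votes).most_common(1)[0][0] if votes else ABSTAIN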
Conditional Prompt Learning for Vision-Language Models
TLDR
Conditional Context Optimization (CoCoOp) is proposed, which extends CoOp by further learning a lightweight neural network to generate an input-conditional token (vector) for each image, and yields stronger domain generalization performance as well.
Co-training Improves Prompt-based Learning for Large Language Models
TLDR
It is demonstrated that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled data; co-training makes it possible to improve the original prompt model and, at the same time, learn a smaller, downstream task-specific model.
Adaptive Prompt Learning-based Few-Shot Sentiment Analysis
TLDR
An adaptive prompting (AP) construction strategy is proposed that uses a seq2seq-attention structure to capture the semantic information of the input sequence and dynamically construct an adaptive prompt; this not only improves the quality of the prompt but also generalizes to other domains via prompts pre-trained on existing public labeled data.
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
TLDR
A novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART) is proposed, which can convert small language models into better few-shot learners.
Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5)
TLDR
A flexible and unified text-to-text paradigm called P5 unifies various recommendation tasks in a shared framework; it has the potential to serve as a foundation model for downstream recommendation tasks, allows easy integration with other modalities, and enables instruction-based recommendation, moving recommender systems toward a universal recommendation engine.
Learning to Prompt for Vision-Language Models
TLDR
Context Optimization (CoOp) is proposed, a simple approach for adapting CLIP-like vision-language models to downstream image recognition; it requires as few as one or two shots to beat hand-crafted prompts by a decent margin and gains significant further improvements when using more shots.
PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains
TLDR
This work presents PADA, an example-based autoregressive Prompt learning algorithm for on-the-fly Any-Domain Adaptation based on the T5 language model, and shows that it substantially outperforms strong baselines.
Instruction Induction: From Few Examples to Natural Language Task Descriptions
TLDR
It is discovered that, to a large extent, the ability to generate instructions does indeed emerge when using a model that is both large enough and aligned to follow instructions; this surprising result suggests that instruction induction might be a viable learning paradigm in and of itself.
...

References

SHOWING 1-10 OF 31 REFERENCES
Learning How to Ask: Querying LMs with Mixtures of Soft Prompts
TLDR
This work explores the idea of learning prompts by gradient descent, either fine-tuning prompts taken from previous work or starting from random initialization, and shows that the implicit factual knowledge in language models was previously underestimated.
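As a generic illustration of prompt learning by gradient descent (not this paper's specific mixture-of-prompts method), the sketch below prepends a handful of trainable prompt vectors to the input embeddings of a frozen masked language model and optimizes only those vectors; the hyperparameters and the toy one-example objective are assumptions.

# Generic sketch of gradient-based ("soft") prompt learning with a frozen LM.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
for p in model.parameters():      # the pre-trained LM stays frozen
    p.requires_grad = False

n_prompt, hidden = 5, model.config.hidden_size
soft_prompt = torch.nn.Parameter(torch.randn(n_prompt, hidden) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

def prompt_loss(text, target_word):
    # Encode "<text> [MASK]." and prepend the trainable prompt vectors.
    enc = tokenizer(f"{text} {tokenizer.mask_token}.", return_tensors="pt")
    token_embeds = model.get_input_embeddings()(enc["input_ids"])
    embeds = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)
    attn = torch.cat([torch.ones(1, n_prompt, dtype=torch.long),
                      enc["attention_mask"]], dim=1)
    logits = model(inputs_embeds=embeds, attention_mask=attn).logits
    # The mask position shifts right by the number of prompt vectors.
    mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0] + n_prompt
    target = torch.tensor([tokenizer.convert_tokens_to_ids(target_word)])
    return torch.nn.functional.cross_entropy(logits[0, mask_pos], target)

for step in range(10):            # toy loop: fit the prompt on one example
    loss = prompt_loss("The movie was wonderful. Overall it was", "good")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()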
Language Models are Unsupervised Multitask Learners
TLDR
It is demonstrated that language models begin to learn a range of NLP tasks without any explicit supervision when trained on WebText, a new dataset of millions of webpages, suggesting a promising path towards building language processing systems that learn to perform tasks from their naturally occurring demonstrations.
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
TLDR
This work uses the generative nature of language models to construct an artificial development set and, based on entropy statistics of the candidate permutations on this set, identifies performant prompt orderings, yielding a 13% relative improvement for GPT-family models across eleven established text classification tasks.
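The selection step can be sketched as follows: enumerate permutations of the few-shot demonstrations and rank them by the entropy of the label distribution they induce on a probing set. In the sketch, predict_label_probs is a hypothetical stand-in for the actual language-model call and the prompt format is an assumption.

# Sketch of entropy-based selection of demonstration orderings.
import itertools
import math

def global_entropy(label_counts):
    total = sum(label_counts.values())
    probs = [c / total for c in label_counts.values() if c > 0]
    return -sum(p * math.log(p) for p in probs)

def rank_orderings(examples, probe_inputs, labels, predict_label_probs):
    # Score every permutation of the demonstrations on the probing set.
    scored = []
    for perm in itertools.permutations(examples):
        prompt = "\n".join(f"Input: {x}\nLabel: {y}" for x, y in perm)
        counts = {label: 0 for label in labels}
        for probe in probe_inputs:
            probs = predict_label_probs(prompt, probe)  # hypothetical LM call
            counts[max(probs, key=probs.get)] += 1
        # Higher entropy means the ordering is less biased toward one label.
        scored.append((global_entropy(counts), perm))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored

Per the summary above, the probing set itself is generated by the language model, so no additional labeled development data is required.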
Pre-trained Models for Natural Language Processing: A Survey
TLDR
This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
Reordering Examples Helps during Priming-based Few-Shot Learning
TLDR
This work introduces PERO (Prompting with Examples in the Right Order) and shows that, in contrast to existing approaches, it can learn to generalize efficiently using as few as 10 examples.
Zero-shot Text Classification With Generative Language Models
TLDR
This work investigates the use of natural language to enable zero-shot model adaptation to new tasks, using text and metadata from social commenting platforms as the source for a simple pretraining task, and shows that natural language can serve as a simple and powerful descriptor for task adaptation.
Natural Instructions: Benchmarking Generalization to New Tasks from Natural Language Instructions
TLDR
This work uses existing NLP datasets and the instructions used to crowdsource them to create NATURAL INSTRUCTIONS, a dataset of instructions and task-specific input/output data; results indicate that existing models indeed benefit from instructions and hence show improved generalization to new tasks.
Improving Language Understanding by Generative Pre-Training
TLDR
The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, improving upon the state of the art in 9 out of the 12 tasks studied.
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
TLDR
The dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers, is introduced.
MASS: Masked Sequence to Sequence Pre-training for Language Generation
TLDR
This work proposes MAsked Sequence to Sequence pre-training (MASS) for encoder-decoder based language generation tasks; it achieves state-of-the-art accuracy on unsupervised English-French translation, even beating an early attention-based supervised model.
...