Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models
@inproceedings{Yu2022AutomaticLS,
  title     = {Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models},
  author    = {Zichun Yu and Tianyu Gao and Zhengyan Zhang and Yankai Lin and Zhiyuan Liu and Maosong Sun and Jie Zhou},
  booktitle = {International Conference on Computational Linguistics},
  year      = {2022}
}
Prompting, which casts downstream applications as language modeling tasks, has been shown to be sample-efficient compared to standard fine-tuning with pre-trained models. However, one pitfall of prompting is the need for manually designed patterns, whose outcomes can be unintuitive and which require large validation sets to tune. To tackle this challenge, we propose AutoSeq, a fully automatic prompting method: (1) we adopt natural language prompts on sequence-to-sequence models, enabling free-form…
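The snippet below is a minimal sketch of the prompting setup the abstract describes: a downstream example is wrapped in a natural language prompt, and a sequence-to-sequence model (T5 here, via Hugging Face transformers) scores a free-form label sequence for each class. The template, the t5-base checkpoint, and the candidate label sequences are illustrative assumptions for a sentiment task, not the patterns or sequences the paper itself produces.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

def label_sequence_score(source, target):
    # Sum log-likelihood of a candidate label sequence given the prompted input.
    enc = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(**enc, labels=labels).loss  # mean cross-entropy per target token
    return -loss.item() * labels.size(1)

# Hypothetical prompt: the input is wrapped in a template with a T5 sentinel token,
# and each class maps to a free-form label sequence filling that sentinel.
review = "A moving and beautifully shot film."
prompt = f'Review: "{review}" Overall, it was <extra_id_0> .'
label_sequences = {
    "positive": "<extra_id_0> really great",     # illustrative label sequence
    "negative": "<extra_id_0> a waste of time",  # illustrative label sequence
}

prediction = max(label_sequences, key=lambda c: label_sequence_score(prompt, label_sequences[c]))
print(prediction)

In the paper's setting, such label sequences would be searched for and selected automatically rather than written by hand; the sketch is only meant to show how a seq2seq model scores free-form label sequences under a natural language prompt.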