Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models

  title={Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models},
  author={Zichun Yu and Tianyu Gao and Zhengyan Zhang and Yankai Lin and Zhiyuan Liu and Maosong Sun and Jie Zhou},
  booktitle={International Conference on Computational Linguistics},
Prompting, which casts downstream applications as language modeling tasks, has shown to be sample efficient compared to standard fine-tuning with pre-trained models. However, one pitfall of prompting is the need of manually-designed patterns, whose outcome can be unintuitive and requires large validation sets to tune. To tackle the challenge, we propose AutoSeq, a fully automatic prompting method: (1) We adopt natural language prompts on sequence-to-sequence models, enabling free-form… 

Figures and Tables from this paper



Prompt-free and Efficient Few-shot Learning with Language Models

Experiments demonstrate that Perfect, a simple and efficient method for few-shot fine-tuning of PLMs without relying on any such handcrafting, also outperforms existing state-of-the-art few- shot learning methods.

Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models

This work shows that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering, and recommends finetuned LMs for few- shot learning as it is more accurate, robust to different prompts, and can be made nearly as efficient as using frozen LMs.

Making Pre-trained Language Models Better Few-shot Learners

The LM-BFF approach makes minimal assumptions on task resources and domain expertise, and hence constitutes a strong task-agnostic method for few-shot learning.

PPT: Pre-trained Prompt Tuning for Few-shot Learning

This work proposes to pre-train prompts by adding soft prompts into the pre-training stage to obtain a better initialization, and names this Pre-trained Prompt Tuning framework “PPT” to ensure the generalization of PPT.

Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

A novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners.

Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference

This work introduces Pattern-Exploiting Training (PET), a semi-supervised training procedure that reformulates input examples as cloze-style phrases to help language models understand a given task.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.

The Power of Scale for Parameter-Efficient Prompt Tuning

This work explores “prompt tuning”, a simple yet effective mechanism for learning “soft prompts” to condition frozen language models to perform specific downstream tasks, and shows that conditioning a frozen model with soft prompts confers benefits in robustness to domain transfer, as compared to full model tuning.

BARTScore: Evaluating Generated Text as Text Generation

This work conceptualizes the evaluation of generated text as a text generation problem, modeled using pre-trained sequence-to-sequence models, and proposes a metric BARTS CORE with a number of variants that can be flexibly applied in an unsupervised fashion to evaluation of text from different perspectives.

Language Models are Few-Shot Learners

GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.