Cross-Task Generalization via Natural Language Crowdsourcing Instructions

@inproceedings{Mishra2022CrossTaskGV,
  title={Cross-Task Generalization via Natural Language Crowdsourcing Instructions},
  author={Swaroop Mishra and Daniel Khashabi and Chitta Baral and Hannaneh Hajishirzi},
  booktitle={ACL},
  year={2022}
}
Humans (e.g., crowdworkers) have a remarkable ability to solve different tasks simply by reading the textual instructions that define them and looking at a few examples. Despite the success of conventional supervised learning on individual datasets, such models often struggle to generalize across tasks (e.g., a question-answering system cannot solve classification tasks). A long-standing challenge in AI is to build a model that learns a new task by understanding the human-readable…
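
To make the setup concrete, here is a minimal Python sketch of a task defined by an instruction plus a few demonstrations, rendered into a single prompt. The field names and the render function are illustrative assumptions, not the paper's exact schema.

# Minimal sketch of an instruction-defined task. The field names here are
# illustrative assumptions, not the exact NATURAL INSTRUCTIONS schema.
task = {
    "instruction": "Given a question, classify it as 'yes/no' or 'open-ended'.",
    "examples": [
        {"input": "Is the sky blue?", "output": "yes/no"},
        {"input": "Why is the sky blue?", "output": "open-ended"},
    ],
    "instance": {"input": "Can dogs swim?"},
}

def render_prompt(task):
    """Serialize the instruction, demonstrations, and new instance into one prompt."""
    lines = [task["instruction"], ""]
    for ex in task["examples"]:
        lines.append(f"Input: {ex['input']}\nOutput: {ex['output']}")
    lines.append(f"Input: {task['instance']['input']}\nOutput:")
    return "\n".join(lines)

print(render_prompt(task))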

CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP

TLDR
This paper presents the NLP Few-shot Gym, a repository of 160 diverse few-shot NLP tasks created from open-access NLP datasets and converted to a unified text-to-text format, and reveals that the few-shot learning ability on unseen tasks can be improved via an upstream learning stage using a set of seen tasks.
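
A brief sketch of what "a unified text-to-text format" means in practice: heterogeneous tasks are all serialized into (source, target) string pairs so a single seq2seq model can train on any of them. The templates below are illustrative, not CrossFit's exact ones.

# Hedged sketch: unifying heterogeneous tasks into text-to-text pairs.
# The templates are illustrative, not CrossFit's exact formats.
def classification_to_text(text, label):
    return (f"classify: {text}", label)

def qa_to_text(question, context, answer):
    return (f"question: {question} context: {context}", answer)

pairs = [
    classification_to_text("The movie was great!", "positive"),
    qa_to_text("Who wrote Hamlet?", "Hamlet is a play by Shakespeare.", "Shakespeare"),
]
# Every task becomes (source, target) strings, so one seq2seq model fits them all.
print(pairs)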

CLUES: A Benchmark for Learning Classifiers using Natural Language Explanations

TLDR
To model the influence of explanations in classifying an example, ExEnt is developed, an entailment-based model that learns classifiers from explanations and generalizes up to 18% better on novel tasks than a baseline that does not use explanations.
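
As a rough illustration of the entailment-based idea, the sketch below scores each label by how strongly the example entails that label's explanations; entailment_score is a toy word-overlap stand-in for a trained NLI model, and all names and data are assumptions.

import re

# Toy stand-in for a trained NLI model; a real system would score entailment
# with a neural entailment classifier, as ExEnt does.
def tokens(s):
    return set(re.findall(r"[a-z]+", s.lower()))

def entailment_score(premise, hypothesis):
    return len(tokens(premise) & tokens(hypothesis))

def classify(example, explanations):
    """Pick the label whose explanations are most entailed by the example."""
    scores = {label: sum(entailment_score(example, e) for e in exps)
              for label, exps in explanations.items()}
    return max(scores, key=scores.get)

explanations = {
    "bird": ["It has feathers.", "It can usually fly."],
    "fish": ["It lives in water.", "It has fins."],
}
print(classify("This animal has feathers and can fly.", explanations))  # bird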

One-Shot Learning from a Demonstration with Hierarchical Latent Language

TLDR
This work proposes a neural agent infused with hierarchical latent language, at both the level of task inference and the level of subtask planning, that is able to generalize to unseen task-performing procedures and to execute them in other contexts.

Continual-T0: Progressively Instructing 50+ Tasks to Language Models Without Forgetting

TLDR
The resulting model, Continual-T0 (CT0), is able to learn diverse new tasks while still maintaining good performance on previous tasks, spanning 70 datasets in total.

Instruction Induction: From Few Examples to Natural Language Task Descriptions

TLDR
It is discovered that, to a large extent, the ability to generate instructions does indeed emerge when using a model that is both large enough and aligned to follow instructions; this surprising result suggests that instruction induction might be a viable learning paradigm in and of itself.
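
The induction setup can be pictured as a meta-prompt that shows demonstrations and asks the model to verbalize the underlying instruction. The wording below is an illustrative assumption, not the paper's exact prompt.

# Hedged sketch of an instruction-induction prompt; the meta-prompt wording
# is an assumption, not the paper's exact template.
demos = [("cat", "tac"), ("dog", "god"), ("bird", "drib")]

prompt = "Here are input-output pairs:\n"
for x, y in demos:
    prompt += f"Input: {x}\nOutput: {y}\n"
prompt += "\nThe instruction that maps each input to its output is:"

# A large, instruction-aligned model is expected to complete this with
# something like "Reverse the input string."
print(prompt)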

InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

TLDR
This work proposes InstructionNER, a multi-task instruction-based generative framework for low-resource named entity recognition. It reformulates NER as a generation problem: source sentences are enriched with task-specific instructions and answer options, and the model then infers the entities and their types in natural language.
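
To show what enriching source sentences with instructions and answer options might look like, here is a small formatting sketch; the template wording and example target are illustrative assumptions, not InstructionNER's exact templates.

# Hedged sketch of recasting NER as instruction-based generation; the template
# and target phrasing are assumptions, not InstructionNER's exact format.
def build_source(sentence, entity_types):
    options = ", ".join(entity_types)
    return (f"Sentence: {sentence}\n"
            f"Instruction: list all named entities in the sentence and their types.\n"
            f"Options: {options}\nAnswer:")

src = build_source("Ada Lovelace was born in London.", ["person", "location"])
tgt = "Ada Lovelace is a person. London is a location."  # generation target
print(src)
print(tgt)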

Few-shot Adaptation Works with UnpredicTable Data

TLDR
This work automatically extracts 413,299 tasks from internet tables, orders of magnitude more than the next-largest public datasets, and finds that narrow subsets of this dataset sometimes outperform more diverse datasets.
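
One plausible way to turn a table into a few-shot task, consistent with the paper's idea, is to hold out one column as the answer and serialize the remaining cells as the input; the code below is a hedged sketch with made-up data.

# Hedged sketch: turning a web table into a few-shot task by treating one
# column as the answer and the rest as the input. Data and format are made up.
table = [
    {"country": "France", "capital": "Paris"},
    {"country": "Japan", "capital": "Tokyo"},
    {"country": "Kenya", "capital": "Nairobi"},
]

def table_to_task(rows, answer_col):
    examples = []
    for row in rows:
        inp = "; ".join(f"{k}: {v}" for k, v in row.items() if k != answer_col)
        examples.append((inp, row[answer_col]))
    return examples

for inp, out in table_to_task(table, "capital"):
    print(f"{inp} -> {out}")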

QuExEnt: Improved Zero-Shot Classification from Explanations Through Quantifier Modeling and Curriculum Learning

TLDR
This work learns better zero-shot classifiers from explanations using three strategies: modeling the semantics of quantifiers present in explanations, aggregating information from multiple explanations with an attention-based mechanism, and training the model via curriculum learning, moving from tasks with simple explanations to tasks with complex explanations.
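
Quantifier modeling can be approximated by mapping quantifier words to soft weights that scale how strongly an explanation counts toward its label; the mapping values below are illustrative assumptions, not the paper's.

# Hedged sketch of quantifier modeling: quantifier words in an explanation are
# mapped to soft weights. The specific values are illustrative assumptions.
QUANTIFIER_WEIGHT = {
    "always": 1.0, "usually": 0.8, "often": 0.7,
    "sometimes": 0.5, "rarely": 0.2, "never": 0.0,
}

def explanation_weight(explanation):
    """Scale how strongly an explanation should count toward its label."""
    for word in explanation.lower().split():
        if word in QUANTIFIER_WEIGHT:
            return QUANTIFIER_WEIGHT[word]
    return 0.6  # default strength when no quantifier is present

print(explanation_weight("Birds can usually fly."))  # 0.8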

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

TLDR
This paper presents a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks without requiring any task-specific data: a unidirectional PLM generates class-conditioned texts guided by prompts, which are used as the training data for fine-tuning a bidirectional PLM.
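
The two-stage pipeline can be sketched as follows: class-conditioned prompts drive a left-to-right generator, and the synthesized (text, label) pairs become training data for a bidirectional classifier. The generate function is a canned placeholder, not a real PLM call.

# Hedged sketch of the two-stage pipeline. `generate` is a canned placeholder
# standing in for sampling from a real unidirectional PLM.
CLASS_PROMPTS = {
    "positive": "Write a positive movie review:",
    "negative": "Write a negative movie review:",
}

def generate(prompt):
    canned = {"positive": "A delightful, moving film...",
              "negative": "A dreary, forgettable mess..."}
    return canned["positive" if "positive" in prompt else "negative"]

synthetic = [(generate(p), label) for label, p in CLASS_PROMPTS.items()]
# `synthetic` now plays the role of labeled data for fine-tuning a
# bidirectional PLM (e.g., a BERT-style classifier), with no human labels.
print(synthetic)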

Unsupervised Cross-Task Generalization via Retrieval Augmentation

TLDR
This paper proposes a retrieval-augmentation method named ReCross that takes a few unlabelled examples as queries to retrieve a small subset of upstream data and uses them to update the multi-task model for better generalization.
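
A rough shape of the retrieval step: embed a few unlabeled target examples, retrieve the nearest upstream training examples by similarity, and use them to update the multi-task model. The embed function below is a toy hashing encoder, not ReCross's retriever, and the data is made up.

# Hedged sketch of the retrieval step. `embed` is a toy hashing encoder
# standing in for a real sentence encoder; data is made up.
import numpy as np

def embed(text):
    vec = np.zeros(16)
    for w in text.lower().split():
        vec[hash(w) % 16] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-8)

upstream = ["translate this sentence", "answer the question", "classify the review"]
queries = ["label this movie review"]

sims = np.array([[embed(q) @ embed(u) for u in upstream] for q in queries])
top = sims.mean(axis=0).argsort()[::-1][:2]
retrieved = [upstream[i] for i in top]
# `retrieved` would be used as extra fine-tuning data for the multi-task model.
print(retrieved)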
...

References

CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP

TLDR
This paper presents the NLP Few-shot Gym, a repository of 160 diverse few-shot NLP tasks created from open-access NLP datasets and converted to a unified text-to-text format, and reveals that the few-shot learning ability on unseen tasks can be improved via an upstream learning stage using a set of seen tasks.

Learning from Task Descriptions

TLDR
This work introduces a framework for developing NLP systems that solve new tasks after reading their descriptions, synthesizing prior work in this area, and instantiates it with a new English language dataset, ZEST, structured for task-oriented evaluation on unseen tasks.

The Turking Test: Can Language Models Understand Instructions?

TLDR
This work presents the Turking Test, which examines a model's ability to follow natural language instructions of varying complexity, and reveals that a large pretrained language model performs poorly across all tasks.

Few-Shot Text Generation with Natural Language Instructions

TLDR
This work introduces GenPET, a method for text generation based on pattern-exploiting training, a recent approach for combining textual instructions with supervised learning that previously worked only for classification tasks.

Multitask Prompted Training Enables Zero-Shot Task Generalization

TLDR
This work develops a system for easily mapping any natural language task into a human-readable prompted form and fine-tunes a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks.
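
A prompted form of an NLI example, in the style of such templates (the exact wording here is an illustrative assumption):

# Hedged sketch of mapping a dataset example into a human-readable prompted
# form; the template wording is an illustrative assumption.
example = {"premise": "A man is playing guitar.",
           "hypothesis": "A person is making music.",
           "label": "entailment"}

TEMPLATE = ('{premise}\nQuestion: does this imply that "{hypothesis}"? '
            "Yes, no, or maybe?")
ANSWERS = {"entailment": "Yes", "contradiction": "No", "neutral": "Maybe"}

source = TEMPLATE.format(**example)
target = ANSWERS[example["label"]]
print(source, "->", target)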

Finetuned Language Models Are Zero-Shot Learners

TLDR
It is shown that instruction tuning (finetuning language models on a collection of datasets described via instructions) substantially improves zero-shot performance on unseen tasks and outperforms few-shot GPT-3 by a large margin on ANLI, RTE, BoolQ, AI2-ARC, OpenbookQA, and StoryCloze.

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

TLDR
Meta-tuning is proposed, which directly optimizes the zero-shot learning objective by finetuning pre-trained language models on a collection of datasets, aggregating 43 existing datasets and annotating 441 label descriptions in a question-answering (QA) format.

DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

TLDR
A new reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs, and presents a new model that combines reading comprehension methods with simple numerical reasoning to achieve 51% F1.

Calibrate Before Use: Improving Few-Shot Performance of Language Models

TLDR
This work first estimates the model's bias towards each answer by asking for its prediction when given the training prompt and a content-free test input such as "N/A", and then fits calibration parameters that cause the prediction for this input to be uniform across answers.
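
The calibration itself is a small affine correction: estimate the model's bias from a content-free input, then rescale real predictions so the content-free input maps to a uniform distribution. The probabilities below are toy numbers; the diagonal-matrix form follows the procedure the paper describes.

# Hedged sketch of contextual calibration; probabilities are toy numbers.
import numpy as np

p_cf = np.array([0.7, 0.3])   # model's answer probabilities for the input "N/A"
W = np.diag(1.0 / p_cf)       # chosen so the content-free input maps to uniform

def calibrate(p):
    q = W @ p
    return q / q.sum()        # renormalize the rescaled probabilities

p_test = np.array([0.6, 0.4])   # raw prediction on a real input
print(calibrate(p_test))        # surface-form bias toward answer 1 is reduced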

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.