LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

@inproceedings{Rosenbaum2022LINGUISTLM,
  title={LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging},
  author={Andrew Rosenbaum and Saleh Soltan and Wael Hamza and Yannick Versley and Markus Boese},
  booktitle={International Conference on Computational Linguistics},
  year={2022}
}
We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt. In a 10-shot novel intent setting for the SNIPS dataset, LINGUIST surpasses state-of-the-art approaches (Back-Translation and Example Extrapolation) by a wide margin, showing absolute improvement for the target intents of +1.9 points on IC Recall and… 
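The core recipe is to serialize the target intent, slot names, and a handful of seed utterances into a text prompt, have a fine-tuned seq2seq model generate new utterances with inline slot annotations, and convert those annotations back into IC+ST training examples. The sketch below illustrates that flow under assumed details: the bracket-style annotation format, the prompt wording, and the generate_fn stand-in are hypothetical placeholders, not the exact LINGUIST prompt format or the AlexaTM 5B model.

# Minimal sketch of prompting a seq2seq model to emit slot-annotated utterances
# and converting the output into IC+ST training examples (BIO slot tags).
# The prompt/annotation format here is a hypothetical illustration, not the exact
# LINGUIST format; `generate_fn` stands in for any fine-tuned seq2seq model.
import re
from typing import Callable, List, Tuple

def build_prompt(intent: str, slots: List[str], examples: List[str]) -> str:
    """Serialize the target intent, slot names, and a few seed utterances."""
    lines = [f"intent: {intent}", f"slots: {', '.join(slots)}", "examples:"]
    lines += [f"- {ex}" for ex in examples]
    lines.append("generate a new annotated utterance:")
    return "\n".join(lines)

def parse_annotated(utterance: str) -> Tuple[List[str], List[str]]:
    """Turn '[slot_name value]' spans into token-level BIO slot tags."""
    tokens, tags = [], []
    for match in re.finditer(r"\[(\w+)\s+([^\]]+)\]|(\S+)", utterance):
        slot, value, plain = match.groups()
        if plain is not None:
            tokens.append(plain)
            tags.append("O")
        else:
            words = value.split()
            tokens += words
            tags += [f"B-{slot}"] + [f"I-{slot}"] * (len(words) - 1)
    return tokens, tags

def augment(intent: str, slots: List[str], seeds: List[str],
            generate_fn: Callable[[str], str], n: int = 5):
    """Generate `n` synthetic IC+ST examples for a novel intent."""
    prompt = build_prompt(intent, slots, seeds)
    data = []
    for _ in range(n):
        generated = generate_fn(prompt)  # e.g. a fine-tuned seq2seq model
        tokens, tags = parse_annotated(generated)
        data.append({"intent": intent, "tokens": tokens, "slots": tags})
    return data

# Example with a stubbed generator:
fake = lambda _: "set an alarm for [time five am] on [day monday]"
print(augment("SetAlarm", ["time", "day"], ["[time six pm] alarm please"], fake, n=1))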

CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing

This work proposes CLASP, a simple method to improve low-resource semantic parsing for moderate-sized models: synthetic data generated from AlexaTM 20B augments the training set of a model 40x smaller (500M parameters), yielding significant improvements over strong baseline methods.
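As a rough sketch of this augmentation recipe under assumed interfaces: a large generator model produces synthetic (utterance, parse) pairs from a few gold seeds, and the pairs are mixed into the gold training set of the much smaller parser. big_model_generate and train_fn below are hypothetical stand-ins, not the actual AlexaTM 20B or CLASP training code.

# Sketch of large-model data augmentation for a smaller semantic parser.
# `big_model_generate` and `train_fn` are hypothetical stand-ins.
import random
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (utterance, semantic parse)

def synthesize(gold: List[Example],
               big_model_generate: Callable[[List[Example]], Example],
               n_synthetic: int) -> List[Example]:
    """Prompt the large model with a few gold seeds to obtain new labeled pairs."""
    synthetic = []
    for _ in range(n_synthetic):
        seeds = random.sample(gold, k=min(5, len(gold)))  # few-shot seed examples
        synthetic.append(big_model_generate(seeds))
    return synthetic

def train_small_parser(gold: List[Example], synthetic: List[Example], train_fn):
    """Mix gold and synthetic data and train the much smaller parser on the union."""
    return train_fn(gold + synthetic)

# Toy usage with stubbed generator and trainer:
gold = [("play jazz", "PlayMusic ( genre = jazz )")]
gen = lambda seeds: ("play some blues", "PlayMusic ( genre = blues )")
print(train_small_parser(gold, synthesize(gold, gen, 2), train_fn=len))  # -> 3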

References


End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

This work proposes a novel end-to-end model that learns to align and predict slots in a multilingual NLU system, and uses a new multilingual corpus to explore various cross-lingual transfer methods, focusing on the zero-shot setting and leveraging MT for language expansion.

(Almost) Zero-Shot Cross-Lingual Spoken Language Understanding

Different approaches to training an SLU component with little supervision for two new languages, Hindi and Turkish, are examined, and it is shown that with only a few hundred labeled examples the approaches proposed in the literature can be surpassed.

Data-Efficient Paraphrase Generation to Bootstrap Intent Classification and Slot Labeling for New Features in Task-Oriented Dialog Systems

This paper proposes a new, data-efficient approach that uses an interpretation-to-text model for paraphrase generation and, in combination with shuffling-based sampling techniques, obtains diverse and novel paraphrases from small amounts of seed data.
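A very rough, hypothetical reading of the interpretation-to-text plus shuffling-based sampling idea is sketched below: the interpretation (intent and slot values) is flattened into the model input, and the slot order is shuffled before each generation call to push the generator toward more diverse paraphrases. The serialization format and generate_fn are illustrative placeholders, not the paper's actual model.

# Hypothetical sketch of shuffling-based sampling for interpretation-to-text
# paraphrase generation; serialization format and model are placeholders.
import random
from typing import Callable, Dict, List

def serialize(intent: str, slots: Dict[str, str]) -> str:
    """Flatten an interpretation into a text input, slots in the given order."""
    slot_str = " ; ".join(f"{k}={v}" for k, v in slots.items())
    return f"{intent} : {slot_str}"

def sample_paraphrases(intent: str, slots: Dict[str, str],
                       generate_fn: Callable[[str], str], n: int = 3) -> List[str]:
    """Shuffle slot order before each generation call to encourage diversity."""
    outputs = []
    items = list(slots.items())
    for _ in range(n):
        random.shuffle(items)
        outputs.append(generate_fn(serialize(intent, dict(items))))
    return outputs

# Usage with a stubbed generator:
stub = lambda text: f"<paraphrase of: {text}>"
print(sample_paraphrases("BookFlight", {"from": "Boston", "to": "Denver"}, stub))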

Language Models are Unsupervised Multitask Learners

It is demonstrated that language models begin to learn a range of language processing tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.

Improving Neural Machine Translation Models with Monolingual Data

This work pairs monolingual training data with automatic back-translations, treats the result as additional parallel training data, and obtains substantial improvements on the WMT 15 English<->German task and the low-resource IWSLT 14 Turkish->English task.
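The back-translation recipe itself is simple enough to sketch: target-language monolingual sentences are translated back into the source language by a reverse model, and the resulting synthetic pairs are added to the parallel training data. The reverse_translate function below is a placeholder for any target-to-source MT model, not the authors' actual WMT/IWSLT setup.

# Minimal back-translation augmentation sketch. `reverse_translate` stands in
# for a target->source MT model.
from typing import Callable, List, Tuple

def back_translate_augment(
    parallel: List[Tuple[str, str]],          # (source, target) gold pairs
    target_monolingual: List[str],            # extra target-language sentences
    reverse_translate: Callable[[str], str],  # target -> synthetic source
) -> List[Tuple[str, str]]:
    """Return gold pairs plus synthetic pairs built from monolingual data."""
    synthetic = [(reverse_translate(t), t) for t in target_monolingual]
    return parallel + synthetic

# Usage with a stubbed reverse model:
stub = lambda t: f"<synthetic source for: {t}>"
augmented = back_translate_augment(
    [("hello", "hallo")], ["guten morgen", "wie geht es dir"], stub)
print(len(augmented))  # 3 training pairs: 1 gold + 2 synthetic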

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.

MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages

MASSIVE was created by tasking professional translators to localize the English-only SLURP dataset into 50 typologically diverse languages from 29 genera; the dataset, modeling code, and models are released publicly.

Unsupervised Cross-lingual Representation Learning at Scale

It is shown that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks, and, for the first time, that multilingual modeling is possible without sacrificing per-language performance.

Multilingual Paraphrase Generation For Bootstrapping New Features in Task-Oriented Dialog Systems

A multilingual paraphrase generation model is proposed that can generate novel utterances for a target feature and target language, and it shows promise across languages, even in a zero-shot setting where no seed data is available.

Finetuned Language Models Are Zero-Shot Learners

It is shown that instruction tuning (finetuning language models on a collection of datasets described via instructions) substantially improves zero-shot performance on unseen tasks and outperforms few-shot GPT-3 by a large margin on ANLI, RTE, BoolQ, AI2-ARC, OpenbookQA, and StoryCloze.
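As a rough illustration of "datasets described via instructions", the sketch below renders ordinary labeled examples as (instruction text, target text) pairs and pools them across tasks for fine-tuning; the template wording is a hypothetical placeholder, not FLAN's actual templates.

# Sketch of instruction-formatting labeled data for instruction tuning.
# Templates here are hypothetical placeholders, not the FLAN templates.
from typing import Dict, List, Tuple

TEMPLATES: Dict[str, str] = {
    "nli": "Does the premise entail the hypothesis?\npremise: {premise}\nhypothesis: {hypothesis}",
    "sentiment": "Is the sentiment of this review positive or negative?\nreview: {text}",
}

def to_instruction_pairs(task: str, examples: List[dict]) -> List[Tuple[str, str]]:
    """Render each labeled example as an (instruction text, target text) pair."""
    template = TEMPLATES[task]
    return [(template.format(**ex["inputs"]), ex["target"]) for ex in examples]

# Pool several tasks into one instruction-tuning mixture.
mixture = (
    to_instruction_pairs("nli", [{"inputs": {"premise": "A dog runs.", "hypothesis": "An animal moves."}, "target": "yes"}])
    + to_instruction_pairs("sentiment", [{"inputs": {"text": "Loved it."}, "target": "positive"}])
)
print(mixture[0][0])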