LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging
@inproceedings{Rosenbaum2022LINGUISTLM,
  title     = {LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging},
  author    = {Andrew Rosenbaum and Saleh Soltan and Wael Hamza and Yannick Versley and Markus Boese},
  booktitle = {International Conference on Computational Linguistics},
  year      = {2022}
}
We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt. In a 10-shot novel intent setting for the SNIPS dataset, LINGUIST surpasses state-of-the-art approaches (Back-Translation and Example Extrapolation) by a wide margin, showing absolute improvement for the target intents of +1.9 points on IC Recall and…
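To make the setup concrete, here is a minimal sketch of how an instruction-prompted seq2seq model could be asked to produce slot-annotated utterances for a new intent. The prompt wording, the bracket-style slot markup, the `build_prompt` helper, and the use of a public FLAN-T5 checkpoint as a stand-in for AlexaTM 5B are all illustrative assumptions, not the actual LINGUIST prompt format.

```python
# Minimal sketch of instruction-prompted generation of slot-annotated utterances.
# The prompt wording, bracket markup, and model choice are illustrative assumptions;
# the paper fine-tunes AlexaTM 5B on its own instruction prompt format.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/flan-t5-base"  # public stand-in for a multilingual seq2seq model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def build_prompt(intent: str, slots: dict[str, str], examples: list[str]) -> str:
    """Compose an instruction asking for a new utterance of the given intent,
    with each slot value wrapped in [slot_name value] tags (hypothetical format)."""
    slot_desc = "; ".join(f"{name} = {value}" for name, value in slots.items())
    shots = "\n".join(f"- {e}" for e in examples)
    return (
        f"Generate a new utterance for the intent '{intent}'.\n"
        f"Mark each slot value as [slot_name value].\n"
        f"Slots to include: {slot_desc}\n"
        f"Examples:\n{shots}\n"
        f"New utterance:"
    )

prompt = build_prompt(
    intent="PlayMusic",
    slots={"artist": "the Beatles", "playlist": "road trip"},
    examples=["play [artist the Beatles] on my [playlist road trip] playlist"],
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Generated strings in a markup like this can then be parsed back into IC+ST training examples (intent label plus slot spans), which is the kind of synthetic data the paper adds to the 10-shot training sets.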
One Citation
CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing
- Computer Science · AACL
- 2022
This work proposes CLASP, a simple method to improve low-resource semantic parsing (SP) for moderate-sized models: synthetic data generated by AlexaTM 20B augments the training set of a model 40x smaller (500M parameters), yielding significant improvements over strong baseline methods.
References
Showing 1-10 of 50 references
End-to-End Slot Alignment and Recognition for Cross-Lingual NLU
- Computer Science, Linguistics · EMNLP
- 2020
This work proposes a novel end-to-end model that learns to align and predict slots in a multilingual NLU system, and uses the accompanying corpus to explore various cross-lingual transfer methods, focusing on the zero-shot setting and leveraging MT for language expansion.
(Almost) Zero-Shot Cross-Lingual Spoken Language Understanding
- Computer Science, Linguistics · 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
Different approaches to training an SLU component with little supervision for two new languages, Hindi and Turkish, are examined, and it is shown that with only a few hundred labeled examples the authors can surpass approaches proposed in the literature.
Data-Efficient Paraphrase Generation to Bootstrap Intent Classification and Slot Labeling for New Features in Task-Oriented Dialog Systems
- Computer Science · COLING
- 2020
This paper proposes a new, data-efficient approach that uses an interpretation-to-text model for paraphrase generation and, in combination with shuffling-based sampling techniques, obtains diverse and novel paraphrases from small amounts of seed data.
Language Models are Unsupervised Multitask Learners
- Computer Science
- 2019
It is demonstrated that language models begin to learn natural language processing tasks without any explicit supervision when trained on WebText, a new dataset of millions of webpages, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Improving Neural Machine Translation Models with Monolingual Data
- Computer Science · ACL
- 2016
This work pairs monolingual training data with automatic back-translations, treating the result as additional parallel training data, and obtains substantial improvements on the WMT 15 English<->German task and on the low-resource IWSLT 14 Turkish->English task.
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- Computer Science · J. Mach. Learn. Res.
- 2020
This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.
MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages
- Computer Science · arXiv
- 2022
MASSIVE was created by tasking professional translators to localize the English-only SLURP dataset into 50 typologically diverse languages from 29 genera; the dataset, modeling code, and models are released publicly.
Unsupervised Cross-lingual Representation Learning at Scale
- Computer Science · ACL
- 2020
It is shown that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks, demonstrating for the first time that multilingual modeling is possible without sacrificing per-language performance.
Multilingual Paraphrase Generation For Bootstrapping New Features in Task-Oriented Dialog Systems
- Computer Science · NLP4CONVAI
- 2021
A multilingual paraphrase generation model is proposed that can generate novel utterances for a target feature and target language; it shows promise across languages, even in a zero-shot setting where no seed data is available.
Finetuned Language Models Are Zero-Shot Learners
- Computer Science · ICLR
- 2022
It is shown that instruction tuning (finetuning language models on a collection of datasets described via instructions) substantially improves zero-shot performance on unseen tasks and outperforms few-shot GPT-3 by a large margin on ANLI, RTE, BoolQ, AI2-ARC, OpenbookQA, and StoryCloze.
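Since several of the works above, and LINGUIST itself, rely on instruction tuning, the sketch below shows one plausible way to turn a labeled intent-classification example into an instruction/response training pair. The template wording and the `to_instruction_pair` helper are assumptions for illustration, not the FLAN or LINGUIST templates.

```python
# Minimal sketch of turning a labeled IC example into an instruction-tuning pair.
# The template wording is an illustrative assumption, not the FLAN or LINGUIST format.
def to_instruction_pair(utterance: str, intent: str, intent_options: list[str]) -> dict[str, str]:
    """Wrap a labeled utterance in a natural-language instruction so a seq2seq
    or decoder-only model can be finetuned on (input, target) text pairs."""
    prompt = (
        "Classify the intent of the following utterance.\n"
        f"Options: {', '.join(intent_options)}\n"
        f"Utterance: {utterance}\n"
        "Intent:"
    )
    return {"input": prompt, "target": intent}


example = to_instruction_pair(
    utterance="wake me up at seven tomorrow",
    intent="SetAlarm",
    intent_options=["SetAlarm", "PlayMusic", "GetWeather"],
)
print(example["input"])
print(example["target"])  # -> SetAlarm
```

Instruction tuning finetunes a model on many such pairs drawn from diverse tasks; LINGUIST uses the same mechanism, but with prompts that ask the model to generate new annotated utterances rather than label existing ones.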