Learning to Classify Intents and Slot Labels Given a Handful of Examples

@article{Krone2020LearningTC,
  title={Learning to Classify Intents and Slot Labels Given a Handful of Examples},
  author={Jason Krone and Yi Zhang and Mona T. Diab},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.10793}
}
Intent classification (IC) and slot filling (SF) are core components in most goal-oriented dialogue systems. Current IC/SF models perform poorly when the number of training examples per class is small. We propose a new few-shot learning task, few-shot IC/SF, to study and improve the performance of IC and SF models on classes not seen at training time in ultra-low-resource scenarios. We establish a few-shot IC/SF benchmark by defining few-shot splits for three public IC/SF datasets, ATIS, TOP…
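
The benchmark is evaluated episodically: each few-shot episode samples a small labeled support set and a disjoint query set from classes held out at training time. As a concrete illustration, here is a minimal sketch of N-way K-shot episode construction; the data layout, function name, and hyperparameters are illustrative assumptions, not the authors' split-generation code.

```python
import random
from collections import defaultdict

def sample_episode(examples, n_way=5, k_shot=10, n_query=15, seed=None):
    """Sample one N-way K-shot episode for few-shot intent classification.

    `examples` is an assumed list of (utterance, intent_label) pairs;
    the grouping and sampling scheme below is a generic sketch, not the
    authors' exact split procedure.
    """
    rng = random.Random(seed)
    by_intent = defaultdict(list)
    for utterance, intent in examples:
        by_intent[intent].append(utterance)

    # Only intents with enough examples can fill both support and query sets.
    eligible = [c for c, utts in by_intent.items() if len(utts) >= k_shot + n_query]
    episode_classes = rng.sample(eligible, n_way)

    support, query = [], []
    for intent in episode_classes:
        utts = rng.sample(by_intent[intent], k_shot + n_query)
        support += [(u, intent) for u in utts[:k_shot]]
        query += [(u, intent) for u in utts[k_shot:]]
    return support, query
```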

Strategies to Improve Few-shot Learning for Intent Classification and Slot-Filling

This work systematically investigates how contrastive learning and data augmentation methods can benefit existing meta-learning pipelines for jointly modelled IC/SF tasks; the proposed approaches consistently outperform the existing state of the art on both IC and SF.

An Explicit-Joint and Supervised-Contrastive Learning Framework for Few-Shot Intent Classification and Slot Filling

This paper proposes a novel explicit-joint and supervised-contrastive learning framework for few-shot intent classification and slot filling, and constructs episodes in an uncommon but practical way that dispenses with the traditional fixed-way, fixed-shot setting and allows for unbalanced datasets.
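
For readers unfamiliar with the supervised-contrastive ingredient, the sketch below shows the generic supervised contrastive loss over a batch of utterance embeddings (same-label pairs are pulled together, all others pushed apart). The temperature and function name are illustrative, and this is the standard formulation rather than this paper's exact joint objective.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over a labeled batch.

    embeddings: (B, D) float tensor; labels: (B,) long tensor.
    Same-label pairs act as positives; everything else as negatives.
    """
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                         # (B, B) scaled cosine sims
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float("-inf"))             # drop self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    positives = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    pos_counts = positives.sum(dim=1)
    valid = pos_counts > 0                                # anchors with >= 1 positive
    pos_log_prob = log_prob.masked_fill(~positives, 0.0)
    loss = -pos_log_prob[valid].sum(dim=1) / pos_counts[valid]
    return loss.mean()
```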

Semi-Supervised Few-Shot Intent Classification and Slot Filling

The proposed semi-supervised approaches outperform standard supervised meta-learning methods: contrastive losses in conjunction with prototypical networks consistently outperform the existing state of the art for both IC and SF tasks, while data augmentation strategies primarily improve few-shot IC by a significant margin.

Meta Learning to Classify Intent and Slot Labels with Noisy Few Shot Examples

A novel noise-robust few-shot SLU model based on prototypical networks is proposed; it consistently outperforms both a conventional fine-tuning baseline and another popular meta-learning method, Model-Agnostic Meta-Learning (MAML), achieving better IC accuracy and SL F1 while exhibiting smaller performance variation when noise is present.
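
Prototypical networks adapt to new classes without gradient steps, whereas the MAML baseline fine-tunes from a meta-learned initialization on each episode's support set. A minimal first-order MAML (FOMAML) meta-update is sketched below, assuming a generic PyTorch model and loss; all names and hyperparameters are illustrative, not the paper's implementation.

```python
import copy
import torch

def fomaml_step(model, loss_fn, support_batch, query_batch,
                meta_optimizer, inner_lr=0.01, inner_steps=1):
    """One first-order MAML meta-update (FOMAML sketch).

    Adapts a copy of the model on the support set, evaluates on the
    query set, and applies the query gradients to the original model.
    """
    learner = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)

    # Inner loop: adapt to the episode's support set.
    for _ in range(inner_steps):
        x_s, y_s = support_batch
        inner_opt.zero_grad()
        loss_fn(learner(x_s), y_s).backward()
        inner_opt.step()

    # Outer loop: the adapted learner's query loss drives the meta-update.
    x_q, y_q = query_batch
    query_loss = loss_fn(learner(x_q), y_q)
    grads = torch.autograd.grad(query_loss, learner.parameters())

    meta_optimizer.zero_grad()
    for p, g in zip(model.parameters(), grads):
        p.grad = g  # first-order approximation: reuse the adapted model's grads
    meta_optimizer.step()
    return query_loss.item()
```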

Knowledge Distillation Meets Few-Shot Learning: An Approach for Few-Shot Intent Classification Within and Across Domains

This paper introduces an approach for distilling small models that generalize to new intent classes and domains using only a handful of labeled examples; experiments on public intent classification benchmarks confirm the generalization ability of the small distilled models, which also have lower computational costs.
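
Distillation objectives of this kind typically combine a temperature-scaled KL term against the teacher's soft predictions with the usual hard-label cross-entropy. The sketch below shows that generic Hinton-style loss; the temperature, weighting, and function name are illustrative assumptions rather than this paper's settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Generic knowledge-distillation loss: KL to the teacher's softened
    distribution plus standard cross-entropy on the gold labels."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # T^2 rescales gradients to balance against the hard-label term.
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```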

Learning to Bridge Metric Spaces: Few-shot Joint Learning of Intent Detection and Slot Filling

A similarity-based few-shot learning scheme, named Contrastive Prototype Merging network (ConProm), is proposed that learns to bridge the metric spaces of intent and slot on data-rich domains and then adapts the bridged metric space to specific few-shot domains.

Few-shot Intent Classification and Slot Filling with Retrieved Examples

This paper proposes a span-level retrieval method that learns similar contextualized representations for spans with the same label via a novel batch-softmax objective and uses the labels of the retrieved spans to construct the final structure with the highest aggregated score.
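
At inference time, the retrieval view of slot filling assigns each candidate span the label of its most similar indexed span. The sketch below shows a deliberately simplified 1-nearest-neighbor version with cosine similarity; the paper additionally aggregates span scores into a globally consistent structure, which is omitted here, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def retrieve_span_labels(query_spans, index_spans, index_labels):
    """Label query spans by their most similar indexed span.

    query_spans: (Q, D) and index_spans: (N, D) span embeddings;
    index_labels: list of N slot labels. Plain 1-nearest-neighbor
    retrieval is a simplifying assumption.
    """
    q = F.normalize(query_spans, dim=1)
    k = F.normalize(index_spans, dim=1)
    best = (q @ k.t()).argmax(dim=1)          # (Q,) index of nearest span
    return [index_labels[i] for i in best.tolist()]
```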

ConVEx: Data-Efficient and Few-Shot Slot Labeling

ConVEx’s reduced pretraining times and cost, along with its efficient fine-tuning and strong performance, promise wider portability and scalability for data-efficient sequence-labeling tasks in general.

FewJoint: few-shot learning for joint dialogue understanding

This paper introduces FewJoint, the first FSL benchmark for joint dialogue understanding; it guides slot filling with explicit intent information and proposes a novel trust-gating mechanism that blocks low-confidence intent information to ensure high-quality sharing.
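
The trust-gating idea can be illustrated as a confidence test on the predicted intent distribution: intent features are shared with the slot tagger only when the intent prediction is confident enough. The hard 0/1 gate, threshold value, and names below are illustrative simplifications, not the paper's exact mechanism.

```python
import torch

def gate_intent_features(intent_probs, intent_features, threshold=0.7):
    """Trust-gate sketch: block low-confidence intent information.

    intent_probs: (B, C) softmax over intents; intent_features: (B, D)
    features to share with the slot tagger. If the top intent probability
    falls below `threshold`, the shared features are zeroed out.
    """
    confidence, _ = intent_probs.max(dim=1, keepdim=True)   # (B, 1)
    gate = (confidence >= threshold).float()
    return intent_features * gate                           # broadcasts to (B, D)
```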

Label Semantic Aware Pre-training for Few-shot Text Classification

LSAP incorporates label semantics into pre-trained generative models (T5 in this case) by performing secondary pre-training on labeled sentences from a variety of domains, using a filtering and labeling pipeline to automatically create sentence-label pairs from unlabeled text.

References

Showing 1-10 of 36 references

Simple, Fast, Accurate Intent Classification and Slot Labeling for Goal-Oriented Dialogue Systems

This work designs a framework for modularizing the joint IC-SL task to enhance architectural transparency, explores a number of self-attention, convolutional, and recurrent models, and contributes a large-scale analysis of modeling paradigms for IC+SL across two datasets.

Induction Networks for Few-Shot Text Classification

This paper proposes a novel Induction Network to learn a generalized representation of each class in the support set, by innovatively leveraging the dynamic routing algorithm in meta-learning, and finds the model is able to induce and generalize better.

Few-Shot Text Classification with Induction Network

A novel Induction Network is proposed to learn generalized class-wise representations in few-shot text classification, innovatively combining the dynamic routing algorithm with the typical meta-learning framework, and is able to induce from particularity to universality, which is a more human-like learning approach.

Prototypical Networks for Few-shot Learning

This work proposes Prototypical Networks for few-shot classification, and provides an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning.
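
The prototypical-network decision rule is compact enough to state in code: average each class's support embeddings into a prototype, then score queries by negative distance to the prototypes. This sketch uses squared Euclidean distance, as in the paper; names and shapes are illustrative.

```python
import torch

def prototypical_logits(support_emb, support_labels, query_emb, n_classes):
    """Prototypical-network classification sketch.

    support_emb: (S, D), support_labels: (S,) class ids in [0, n_classes),
    query_emb: (Q, D). Returns (Q, n_classes) logits as negative squared
    Euclidean distances to the class prototypes.
    """
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(n_classes)
    ])                                             # (n_classes, D)
    return -torch.cdist(query_emb, prototypes) ** 2
```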

Diverse Few-Shot Text Classification with Multiple Metrics

This work proposes an adaptive metric learning approach that automatically determines the best weighted combination from a set of metrics obtained from meta-training tasks for a newly seen few-shot task.

FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation

Empirical results show that even the most competitive few-shot learning models struggle on this task, especially as compared with humans, and indicate that few-shot relation classification remains an open problem and still requires further research.

Optimization as a Model for Few-Shot Learning

Matching Networks for One Shot Learning

This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.
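
In its simplest form, a matching network predicts a query's label as an attention-weighted sum of the support labels. The sketch below uses plain cosine-similarity attention and omits the paper's full-context embeddings (the LSTM that conditions embeddings on the whole support set); names are illustrative.

```python
import torch
import torch.nn.functional as F

def matching_network_probs(support_emb, support_labels, query_emb, n_classes):
    """Matching-network prediction sketch: softmax attention over the
    support set weights each support example's one-hot label.

    support_emb: (S, D), support_labels: (S,), query_emb: (Q, D).
    Returns (Q, n_classes) class probabilities.
    """
    attn = F.softmax(
        F.normalize(query_emb, dim=1) @ F.normalize(support_emb, dim=1).t(),
        dim=1,
    )                                                        # (Q, S)
    one_hot = F.one_hot(support_labels, n_classes).float()   # (S, C)
    return attn @ one_hot                                    # (Q, C)
```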

Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples

This work proposes Meta-Dataset, a new large-scale benchmark for training and evaluating models that consists of diverse datasets and presents more realistic tasks, and proposes a new set of baselines for quantifying the benefit of meta-learning in Meta-Dataset.

Attentive Task-Agnostic Meta-Learning for Few-Shot Text Classification

The proposed ATAML method is designed to encourage task-agnostic representation learning through task-agnostic parameterization and to facilitate task-specific adaptation via attention mechanisms, and it generalizes better on single-label and multi-label classification tasks in the miniRCV1 and miniReuters-21578 datasets.