GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation

@article{Chen2021GOLDIO,
  title={GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation},
  author={Derek Chen and Zhou Yu},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.03079}
}
Practical dialogue systems require robust methods of detecting out-of-scope (OOS) utterances to avoid conversational breakdowns and related failure modes. Directly training a model with labeled OOS examples yields reasonable performance, but obtaining such data is a resource-intensive process. To tackle this limited-data problem, previous methods focus on better modeling the distribution of in-scope (INS) examples. We introduce GOLD as an orthogonal technique that augments existing data to… 

DG2: Data Augmentation Through Document Grounded Dialogue Generation

TLDR
An automatic data augmentation technique grounded on documents: a generative dialogue model, consisting of a user bot and an agent bot, synthesizes diverse dialogues from an input document, and these dialogues are then used to train a downstream model.

Metric Learning and Adaptive Boundary for Out-of-Domain Detection

TLDR
This work designs an OOD detection algorithm, independent of OOD data, that outperforms a wide range of current state-of-the-art algorithms on publicly available datasets and is based on a simple but efficient approach of combining metric learning with an adaptive decision boundary.

Knowledge-Grounded Conversational Data Augmentation with Generative Conversational Networks

TLDR
The results show that, for conversations without knowledge grounding, GCN can generalize from the seed data, producing novel conversations that are less relevant but more engaging, and that, for knowledge-grounded conversations, it can produce more knowledge-focused, fluent, and engaging conversations.

Data Augmentation for Intent Classification

TLDR
It is found that while certain methods dramatically improve qualitative and quantitative performance, other methods have minimal or even negative impact.

POEM: Out-of-Distribution Detection with Posterior Sampling

TLDR
A novel posterior sampling-based outlier mining framework, POEM, is proposed, which facilitates the use of outlier data and promotes learning a compact decision boundary between ID and OOD data for improved detection.

References

Improving Dialogue Breakdown Detection with Semi-Supervised Learning

TLDR
The use of semi-supervised learning methods to improve dialogue breakdown detection, including continued pre-training on the Reddit dataset and a manifold-based data augmentation method, is investigated.

Out-of-Domain Detection for Natural Language Understanding in Dialog Systems

TLDR
A novel model is proposed to generate high-quality pseudo OOD samples that are akin to in-domain (IND) input utterances, thereby improving the performance of OOD detection; the model is demonstrated to be effective in NLU.

Likelihood Ratios and Generative Classifiers for Unsupervised Out-of-Domain Detection In Task Oriented Dialog

TLDR
This work is the first to investigate the use of a generative classifier and the computation of a marginal likelihood (ratio) for OOD detection at test time, and finds that this approach outperforms both simple likelihood (ratio) based and other prior approaches.

Automatically Learning Data Augmentation Policies for Dialogue Tasks

TLDR
This work adapts AutoAugment to automatically discover effective perturbation policies for natural language processing (NLP) tasks such as dialogue generation, and achieves significant improvements over the previous state of the art, including models trained on manually designed policies.

Taskmaster-1: Toward a Realistic and Diverse Dialog Dataset

TLDR
This work introduces the initial release of the Taskmaster-1 dataset, which includes 13,215 task-based dialogs comprising six domains, and offers several baseline models, including state-of-the-art neural seq2seq architectures, with benchmark performance as well as qualitative human evaluations.

Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding

TLDR
A sequence-to-sequence generation based data augmentation framework that leverages semantically equivalent alternatives of an utterance in the training data to produce diverse utterances that help to improve the language understanding module.

Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection

TLDR
The broader analysis shows that the reason for success lies in the fact that the fine-tuned Transformer is capable of constructing homogeneous representations of in-domain utterances, revealing a geometrical disparity to out-of-domain utterances, and that the Mahalanobis distance captures this disparity easily.

Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog

TLDR
This paper presents a new data set of 57k annotated utterances in English, Spanish and Thai and uses this data set to evaluate three different cross-lingual transfer methods, finding that given several hundred training examples in the target language, the latter two methods outperform translating the training data.

KLOOS: KL Divergence-based Out-of-Scope Intent Detection in Human-to-Machine Conversations

TLDR
An out-of-scope intent detection method, called KLOOS, based on a novel feature extraction mechanism that incorporates the information accumulation of sequential word processing, which statistically significantly improves out-of-scope sensitivity in all cases.

Joint Learning of Domain Classification and Out-of-Domain Detection with Dynamic Class Weighting for Satisficing False Acceptance Rates

TLDR
A neural joint learning model for domain classification and OOD detection is introduced, where dynamic class weighting is used during the model training to satisfice a given OOD false acceptance rate (FAR) while maximizing the domain classification accuracy.
...