Corpus ID: 237503353

Types of Out-of-Distribution Texts and How to Detect Them

@inproceedings{Arora2021TypesOO,
  title={Types of Out-of-Distribution Texts and How to Detect Them},
  author={Udit Arora and William Huang and He He},
  booktitle={EMNLP},
  year={2021}
}
Despite agreement on the importance of detecting out-of-distribution (OOD) examples, there is little consensus on the formal definition of OOD examples and how to best detect them. We categorize these examples by whether they exhibit a background shift or a semantic shift, and find that the two major approaches to OOD detection, model calibration and density estimation (language modeling for text), have distinct behavior on these types of OOD data. Across 14 pairs of in-distribution and OOD…

References

Contrastive Training for Improved Out-of-Distribution Detection
TLDR
This paper proposes and investigates the use of contrastive training to boost OOD detection performance, and introduces and employs the Confusion Log Probability (CLP) score, which quantifies the difficulty of the OOD detection task by capturing the similarity of inlier and outlier datasets.
Contrastive Out-of-Distribution Detection for Pretrained Transformers
TLDR
This paper studies the OoD detection problem for pretrained transformers using only in-distribution data in training and proposes a contrastive loss that improves the compactness of representations, such that OoD instances can be better differentiated from in-distribution ones.
Why Normalizing Flows Fail to Detect Out-of-Distribution Data
TLDR
This work demonstrates that flows learn local pixel correlations and generic image-to-latent-space transformations which are not specific to the target image dataset, and shows that by modifying the architecture of flow coupling layers, the authors can bias the flow towards learning the semantic structure of the target data, improving OOD detection.
Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data
TLDR
The proposed regularized fine-tuning method outperforms existing calibration methods for text classification in terms of expectation calibration error, misclassification detection, and OOD detection on six datasets.
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
TLDR
There is substantial room for improvement in NLI systems, and the HANS dataset, which contains many examples where the heuristics fail, can motivate and measure progress in this area.
Detecting semantic anomalies
TLDR
It is argued that out-distributions of practical interest are ones where the distinction is semantic in nature for a specified context, and that evaluative tasks should reflect this more closely.
Out-of-Domain Detection for Natural Language Understanding in Dialog Systems
TLDR
A novel model is proposed to generate high-quality pseudo-OOD samples that are akin to in-domain (IND) input utterances, thereby improving the performance of OOD detection, and the approach is demonstrated to be effective in NLU.
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
TLDR
A simple baseline that utilizes probabilities from softmax distributions is presented, showing the effectiveness of this baseline across computer vision, natural language processing, and automatic speech recognition tasks, and it is shown that the baseline can sometimes be surpassed.
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
TLDR
GLUE comprises a benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models for understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models; the benchmark favors models that can represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge transfer across tasks.
Out-of-Domain Detection for Low-Resource Text Classification Tasks
TLDR
Evaluations on real-world datasets show that the proposed solution outperforms state-of-the-art methods on the zero-shot OOD detection task, while maintaining competitive performance on the ID classification task.