Corpus ID: 235899139

Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks

Authors: A. Malinin, Neil Band, German Chesnokov, Y. Gal, M. Gales, A. Noskov, Andrey Ploskonosov, L. Prokhorenkova, Ivan Provilkov, Vatsal Raina, Vyas Raina, Mariya Shmatova, Panos Tigas, Boris Yangel
There has been significant research done on developing methods for improving robustness to distributional shift and uncertainty estimation. In contrast, only limited work has examined developing standard datasets and benchmarks for assessing these approaches. Additionally, most work on uncertainty estimation and robustness has developed new techniques based on small-scale regression or image classification tasks. However, many tasks of practical interest have different modalities, such as…


Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
This work presents a large-scale benchmark of existing state-of-the-art methods on classification problems and evaluates the effect of dataset shift on accuracy and calibration, finding that traditional post-hoc calibration does indeed fall short, as do several other previous methods.
Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness
This paper investigates using Prior Networks to detect adversarial attacks, proposes a generalized form of adversarial training, and shows that the appropriate training criterion for Prior Networks is the reverse KL-divergence between Dirichlet distributions.
Predictive Uncertainty Estimation via Prior Networks
This work proposes a new framework for modeling predictive uncertainty called Prior Networks (PNs), which explicitly models distributional uncertainty by parameterizing a prior distribution over predictive distributions. PNs are evaluated on the tasks of identifying out-of-distribution samples and detecting misclassification on the MNIST dataset, where they are found to outperform previous methods.
Generalized ODIN: Detecting Out-of-Distribution Image Without Learning From Out-of-Distribution Data
This work builds on the popular ODIN method, proposing two strategies that free it from the need to tune with OoD data while improving its OoD detection performance; specifically, it proposes decomposed confidence scoring and a modified input pre-processing method.
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
This work proposes an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates.
Uncertainty estimation in deep learning with application to spoken language assessment
Prior Networks combine the advantages of ensemble and single-model approaches to estimating uncertainty and are evaluated on a range of classification datasets, where they are shown to outperform baseline approaches on the task of detecting out-of-distribution inputs.
Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets
This work proposes a new training objective which minimizes the reverse KL-divergence to a Proxy-Dirichlet target derived from the ensemble, and shows that, for the Dirichlet log-likelihood criterion, classes with low probability induce larger gradients than high-probability classes.
Improving Deterministic Uncertainty Estimation in Deep Learning for Classification and Regression
We propose a new model that estimates uncertainty in a single forward pass and works on both classification and regression problems. Our approach combines a bi-Lipschitz feature extractor with an…
MTNT: A Testbed for Machine Translation of Noisy Text
This paper proposes a benchmark dataset for Machine Translation of Noisy Text (MTNT), consisting of noisy comments on Reddit and professionally sourced translations, and demonstrates that existing MT models fail badly on a number of noise-related phenomena, even after performing adaptation on a small training set of in-domain data.
Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning
This work focuses on in-domain uncertainty for image classification, introduces the deep ensemble equivalent (DEE), and shows that many sophisticated ensembling techniques are equivalent to an ensemble of very few independently trained networks in terms of the test log-likelihood.