Corpus ID: 209314627

Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning

@article{Ashukha2020PitfallsOI,
  title={Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning},
  author={Arsenii Ashukha and Alexander Lyzhov and Dmitry Molchanov and Dmitry P. Vetrov},
  journal={ArXiv},
  year={2020},
  volume={abs/2002.06470}
}
Uncertainty estimation and ensembling methods go hand-in-hand. Uncertainty estimation is one of the main benchmarks for assessing the performance of ensembling methods; at the same time, deep learning ensembles have provided state-of-the-art results in uncertainty estimation. In this work, we focus on in-domain uncertainty for image classification. We explore the standards for its quantification and point out pitfalls of existing metrics. Avoiding these pitfalls, we perform a broad study of different… 
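Among the metrics the abstract alludes to, binning-based calibration error and test log-likelihood are the most common. As a rough illustration only (not the paper's exact evaluation protocol; function names and the binning scheme are assumptions), a minimal numpy sketch of expected calibration error (ECE) and negative log-likelihood:

```python
import numpy as np

def ece(probs, labels, n_bins=15):
    """Expected calibration error over equal-width confidence bins.

    probs:  (N, C) predicted class probabilities
    labels: (N,)   integer class labels
    """
    conf = probs.max(axis=1)                 # confidence of the top prediction
    pred = probs.argmax(axis=1)
    correct = (pred == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            total += mask.mean() * gap       # weight gap by bin population
    return total

def nll(probs, labels, eps=1e-12):
    """Average negative log-likelihood of the true labels."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))
```

Both quantities depend on implementation details (bin count, bin placement), which is exactly the kind of sensitivity the paper flags as a pitfall when comparing methods.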

Citations

Uncertainty Quantification and Deep Ensembles
TLDR
It is demonstrated that, although standard ensembling techniques certainly help to boost accuracy, the calibration of deep ensembles relies on subtle trade-offs and, crucially, needs to be executed after the averaging process.
Quantifying Epistemic Uncertainty in Deep Learning
TLDR
A theoretical framework is provided to dissect uncertainty in deep learning, especially the epistemic component, into procedural variability and data variability, which is, to the authors' best knowledge, the first such attempt in the literature.
Dropout Strikes Back: Improved Uncertainty Estimation via Diversity Sampling
TLDR
This work shows that modifying the sampling distributions for dropout layers in neural networks improves the quality of uncertainty estimation, and demonstrates that the diversification via determinantal point processes-based sampling achieves state-of-the-art results in uncertainty estimation for regression and classification tasks.
Depth Uncertainty in Neural Networks
TLDR
This work performs probabilistic reasoning over the depth of neural networks to exploit the sequential structure of feed-forward networks and provide uncertainty calibration, robustness to dataset shift, and accuracies competitive with more computationally expensive baselines.
Self-Distribution Distillation: Efficient Uncertainty Estimation
TLDR
This work proposes a novel training approach, self-distribution distillation (S2D), which is able to efficiently train a single model that can estimate uncertainties, and shows that even a standard deep ensemble can be outperformed using S2D based ensembles and novel distilled models.
Improving Uncertainty Calibration of Deep Neural Networks via Truth Discovery and Geometric Optimization
TLDR
A truth discovery framework to integrate ensemble-based and post-hoc calibration methods, using the geometric variance of the ensemble candidates as a good indicator for sample uncertainty, and designing an accuracy-preserving truth estimator with provably no accuracy drop.
Mixtures of Laplace Approximations for Improved Post-Hoc Uncertainty in Deep Learning
TLDR
This work proposes to predict with a Gaussian mixture model posterior that consists of a weighted sum of Laplace approximations of independently trained deep neural networks and can be used post hoc with any set of pre-trained networks and only requires a small computational and memory overhead compared to regular ensembles.
Training-Free Uncertainty Estimation for Dense Regression: Sensitivity as a Surrogate
TLDR
Three simple and scalable methods to analyze the variance of outputs from a trained network under tolerable perturbations are proposed: infer-transformation, infer-noise, and infer-dropout, which produce comparable or even better uncertainty estimation when compared to training-required state-of-the-art methods.
SLURP: Side Learning Uncertainty for Regression Problems
TLDR
SLURP is proposed, a generic approach for regression uncertainty estimation via a side learner that exploits the output and the intermediate representations generated by the main task model and has a low computational cost with respect to existing solutions.
Uncertainty Estimation of Transformer Predictions for Misclassification Detection
TLDR
A vast empirical investigation of state-of-the-art UE methods for Transformer models on misclassification detection in named entity recognition and text classification tasks and two computationally efficient modifications are proposed, one of which approaches or even outperforms computationally intensive methods.
...

References

Showing 1-10 of 59 references
Evaluating Scalable Bayesian Deep Learning Methods for Robust Computer Vision
TLDR
This work proposes a comprehensive evaluation framework for scalable epistemic uncertainty estimation methods in deep learning and applies this framework to provide the first properly extensive and conclusive comparison of the two current state-of-the-art scalable methods: ensembling and MC-dropout.
Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
TLDR
It is shown that the optima of these complex loss functions are in fact connected by simple curves over which training and test accuracy are nearly constant, and a training procedure is introduced to discover these high-accuracy pathways between modes.
Bayesian Uncertainty Estimation for Batch Normalized Deep Networks
TLDR
It is shown that training a deep network using batch normalization is equivalent to approximate inference in Bayesian models, and it is demonstrated how this finding allows us to make useful estimates of the model uncertainty.
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
TLDR
This work proposes an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates.
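The deep-ensembles recipe summarized above boils down to averaging the predictive distributions of several independently trained networks. A hedged numpy sketch of that averaging step (toy logits only; shapes and names are illustrative assumptions, not the reference's code):

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ensemble_predict(member_logits):
    """Average the probability vectors of M ensemble members.

    member_logits: (M, N, C) logits from M independently trained models
    returns:       (N, C) mean predictive distribution
    """
    probs = np.stack([softmax(l) for l in member_logits])
    return probs.mean(axis=0)
```

Averaging probabilities (rather than logits) is what makes the combined predictive distribution a proper mixture over members.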
Measuring Calibration in Deep Learning
TLDR
A comprehensive empirical study of choices in calibration measures including measuring all probabilities rather than just the maximum prediction, thresholding probability values, class conditionality, number of bins, bins that are adaptive to the datapoint density, and the norm used to compare accuracies to confidences.
Uncertainty Estimation via Stochastic Batch Normalization
TLDR
A probabilistic model is proposed, and it is shown that Batch Normalization maximizes the lower bound of its marginalized log-likelihood; an algorithm is designed which acts consistently during training and testing.
A Simple Baseline for Bayesian Uncertainty in Deep Learning
TLDR
It is demonstrated that SWAG performs well on a wide variety of tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including MC dropout, KFAC Laplace, SGLD, and temperature scaling.
A Scalable Laplace Approximation for Neural Networks
TLDR
This work uses recent insights from second-order optimisation for neural networks to construct a Kronecker factored Laplace approximation to the posterior over the weights of a trained network, enabling practitioners to estimate the uncertainty of models currently used in production without having to retrain them.
How to Train Deep Variational Autoencoders and Probabilistic Ladder Networks
TLDR
This work proposes three advances in training algorithms of variational autoencoders, for the first time allowing to train deep models of up to five stochastic layers, using a structure similar to the Ladder network as the inference model and shows state-of-the-art log-likelihood results for generative modeling on several benchmark datasets.
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
TLDR
A new theoretical framework is developed casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, which mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy.
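The MC-dropout idea summarized above keeps dropout active at test time and averages several stochastic forward passes, using their spread as an uncertainty signal. A minimal sketch on a toy linear-softmax model (the model, `p`, and `T` are illustrative assumptions, not the reference's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_predict(x, W, b, p=0.5, T=100):
    """Monte Carlo dropout for a toy linear-softmax model.

    Each of the T passes samples a fresh Bernoulli mask over the input
    features; the mean of the softmax outputs is the prediction and
    their standard deviation is a simple uncertainty estimate.
    """
    outs = []
    for _ in range(T):
        mask = rng.random(x.shape) >= p         # keep each feature w.p. 1-p
        h = (x * mask / (1.0 - p)) @ W + b      # inverted-dropout scaling
        e = np.exp(h - h.max(axis=-1, keepdims=True))
        outs.append(e / e.sum(axis=-1, keepdims=True))
    outs = np.stack(outs)                       # (T, N, C)
    return outs.mean(axis=0), outs.std(axis=0)  # predictive mean and spread
```

In a real network the dropout masks would be resampled inside every dropout layer per pass; this sketch collapses that to a single input-level mask to keep the mechanism visible.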
...