Corpus ID: 209314627

Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning

@article{Ashukha2020PitfallsOI,
  title={Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning},
  author={Arsenii Ashukha and Alexander Lyzhov and Dmitry Molchanov and Dmitry P. Vetrov},
  journal={ArXiv},
  year={2020},
  volume={abs/2002.06470}
}
Uncertainty estimation and ensembling methods go hand-in-hand. Uncertainty estimation is one of the main benchmarks for assessment of ensembling performance. At the same time, deep learning ensembles have provided state-of-the-art results in uncertainty estimation. In this work, we focus on in-domain uncertainty for image classification. We explore the standards for its quantification and point out pitfalls of existing metrics. Avoiding these pitfalls, we perform a broad study of different… 
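The abstract is truncated here before it gives any detail, so the following is only a minimal illustrative sketch of the general setting it describes, not the authors' procedure or their proposed metrics. It assumes an ensemble prediction is formed by averaging the members' softmax probabilities and that in-domain uncertainty is scored by the mean test log-likelihood. The function names (`ensemble_probs`, `mean_log_likelihood`) and the random logits are hypothetical.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the last (class) axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def ensemble_probs(member_logits):
    # member_logits has shape (n_members, n_examples, n_classes).
    # Average the members' predictive distributions, not their logits.
    return softmax(member_logits).mean(axis=0)


def mean_log_likelihood(probs, labels):
    # Average log-probability assigned to the true class (higher is better).
    picked = probs[np.arange(len(labels)), labels]
    return float(np.log(picked).mean())


# Synthetic stand-in for the logits of 5 trained networks on 8 test images
# with 3 classes (assumption: real logits would come from trained models).
rng = np.random.default_rng(0)
member_logits = rng.normal(size=(5, 8, 3))
labels = rng.integers(0, 3, size=8)

probs = ensemble_probs(member_logits)
ll_ensemble = mean_log_likelihood(probs, labels)
ll_members = [mean_log_likelihood(softmax(m), labels) for m in member_logits]
```

Because the logarithm is concave, the log-likelihood of the averaged prediction is never lower than the average of the members' log-likelihoods, so `ll_ensemble >= np.mean(ll_members)` holds for any inputs. This is one reason log-likelihood comparisons between ensembles and single models need care.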

Uncertainty Quantification and Deep Ensembles
TLDR
It is demonstrated that, although standard ensembling techniques certainly help to boost accuracy, the calibration of deep ensembles relies on subtle trade-offs and, crucially, needs to be executed after the averaging process.
Dropout Strikes Back: Improved Uncertainty Estimation via Diversity Sampling
TLDR
This work shows that modifying the sampling distributions for dropout layers in neural networks improves the quality of uncertainty estimation, and demonstrates that the diversification via determinantal point processes-based sampling achieves state-of-the-art results in uncertainty estimation for regression and classification tasks.
Depth Uncertainty in Neural Networks
TLDR
This work performs probabilistic reasoning over the depth of neural networks to exploit the sequential structure of feed-forward networks and provide uncertainty calibration, robustness to dataset shift, and accuracies competitive with more computationally expensive baselines.
Self-Distribution Distillation: Efficient Uncertainty Estimation
TLDR
This work proposes a novel training approach, self-distribution distillation (S2D), which is able to efficiently train a single model that can estimate uncertainties, and shows that even a standard deep ensemble can be outperformed using S2D based ensembles and novel distilled models.
Improving Uncertainty Calibration of Deep Neural Networks via Truth Discovery and Geometric Optimization
TLDR
A truth discovery framework to integrate ensemble-based and post-hoc calibration methods, using the geometric variance of the ensemble candidates as a good indicator for sample uncertainty, and designing an accuracy-preserving truth estimator with provably no accuracy drop.
Training-Free Uncertainty Estimation for Dense Regression: Sensitivity as a Surrogate
TLDR
A systematic exploration into training-free uncertainty estimation for dense regression, an unrecognized yet important problem, and a theoretical construction justifying such estimations are provided, which produce comparable or even better uncertainty estimation when compared to training-required state-of-the-art methods.
Mixtures of Laplace Approximations for Improved Post-Hoc Uncertainty in Deep Learning
TLDR
This work proposes to predict with a Gaussian mixture model posterior that consists of a weighted sum of Laplace approximations of independently trained deep neural networks and can be used post hoc with any set of pre-trained networks and only requires a small computational and memory overhead compared to regular ensembles.
UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs
TLDR
This work presents an approach to quantifying both aleatoric and epistemic uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs), and improves over the OoD detection and FP detection performance of state-of-the-art GAN-training based classifiers.
Masksembles for Uncertainty Estimation
TLDR
Instead of randomly dropping parts of the network as in MC-dropout, Masksembles relies on a fixed number of binary masks, which are parameterized in a way that allows the correlations between individual models to be changed.
SLURP: Side Learning Uncertainty for Regression Problems
TLDR
SLURP is proposed, a generic approach for regression uncertainty estimation via a side learner that exploits the output and the intermediate representations generated by the main task model and has a low computational cost with respect to existing solutions.

References

Evaluating Scalable Bayesian Deep Learning Methods for Robust Computer Vision
TLDR
This work proposes a comprehensive evaluation framework for scalable epistemic uncertainty estimation methods in deep learning and applies this framework to provide the first properly extensive and conclusive comparison of the two current state-of-the-art scalable methods: ensembling and MC-dropout.
Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
TLDR
It is shown that the optima of these complex loss functions are in fact connected by simple curves over which training and test accuracy are nearly constant, and a training procedure is introduced to discover these high-accuracy pathways between modes.
Bayesian Uncertainty Estimation for Batch Normalized Deep Networks
TLDR
It is shown that training a deep network using batch normalization is equivalent to approximate inference in Bayesian models, and it is demonstrated how this finding allows us to make useful estimates of the model uncertainty.
Measuring Calibration in Deep Learning
TLDR
A comprehensive empirical study of choices in calibration measures including measuring all probabilities rather than just the maximum prediction, thresholding probability values, class conditionality, number of bins, bins that are adaptive to the datapoint density, and the norm used to compare accuracies to confidences.
A Simple Baseline for Bayesian Uncertainty in Deep Learning
TLDR
It is demonstrated that SWAG performs well on a wide variety of tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including MC dropout, KFAC Laplace, SGLD, and temperature scaling.
A Scalable Laplace Approximation for Neural Networks
TLDR
This work uses recent insights from second-order optimisation for neural networks to construct a Kronecker factored Laplace approximation to the posterior over the weights of a trained network, enabling practitioners to estimate the uncertainty of models currently used in production without having to retrain them.
How to Train Deep Variational Autoencoders and Probabilistic Ladder Networks
TLDR
This work proposes three advances in training algorithms for variational autoencoders, making it possible for the first time to train deep models of up to five stochastic layers, uses a structure similar to the Ladder network as the inference model, and shows state-of-the-art log-likelihood results for generative modeling on several benchmark datasets.
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
TLDR
A new theoretical framework is developed casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, which mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy.
Bayesian Inference for Large Scale Image Classification
TLDR
ATMC, an adaptive noise MCMC algorithm that estimates and is able to sample from the posterior of a neural network, is introduced and is shown to be intrinsically robust to overfitting on the training data and to provide a better calibrated measure of uncertainty compared to the optimization baseline.
On Calibration of Modern Neural Networks
TLDR
It is discovered that modern neural networks, unlike those from a decade ago, are poorly calibrated, and on most datasets, temperature scaling -- a single-parameter variant of Platt Scaling -- is surprisingly effective at calibrating predictions.