Corpus ID: 235435823

Revisiting the Calibration of Modern Neural Networks

Matthias Minderer, Josip Djolonga, Rob Romijnders, F. Hubis, Xiaohua Zhai, N. Houlsby, Dustin Tran, Mario Lucic
Accurate estimation of predictive uncertainty (model calibration) is essential for the safe application of neural networks. Many instances of miscalibration in modern neural networks have been reported, suggesting a trend that newer, more accurate models produce poorly calibrated predictions. Here, we revisit this question for recent state-of-the-art image classification models. We systematically relate model calibration and accuracy, and find that the most recent models, notably those not…
A Tale Of Two Long Tails
The results show that well-designed interventions over the course of training can be an effective way to characterize and distinguish between different sources of uncertainty, suggesting that the rate of learning in the presence of additional information differs between atypical and noisy examples.
Exploring the Limits of Out-of-Distribution Detection
It is demonstrated that large-scale pre-trained transformers can significantly improve the state-of-the-art (SOTA) on a range of near OOD tasks across different data modalities, and a new way of using just the names of outlier classes as a sole source of information without any accompanying images is explored.
Improving Entropic Out-of-Distribution Detection using Isometric Distances and the Minimum Distance Score
This paper proposes to perform an isometrization of the distances used in the IsoMax loss and replaces the entropic score with the minimum distance score, and shows that these simple modifications increase out-of-distribution detection performance while keeping the solution seamless.


On Calibration of Modern Neural Networks
It is discovered that modern neural networks, unlike those from a decade ago, are poorly calibrated, and on most datasets, temperature scaling -- a single-parameter variant of Platt Scaling -- is surprisingly effective at calibrating predictions.
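Temperature scaling as described above divides all logits by a single scalar T fitted on held-out data. A minimal sketch, assuming NumPy and a simple grid search over T (the original work uses gradient-based optimization; all function names here are illustrative):

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def fit_temperature(logits, labels, temps=np.linspace(0.1, 10.0, 200)):
    """Pick the temperature T that minimizes the negative log-likelihood
    of logits / T on a held-out validation set (grid-search sketch)."""
    def nll(t):
        lp = log_softmax(logits / t)
        return -lp[np.arange(len(labels)), labels].mean()
    return min(temps, key=nll)
```

For an overconfident model the fitted T comes out greater than 1, softening the predicted probabilities; calibrated predictions are then `softmax(logits / T)`.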
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
This work proposes an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates.
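The core recipe of deep ensembles is to train M models independently and average their predictive distributions. A minimal sketch, assuming NumPy (averaging is done in probability space, not logit space):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_predict(member_logits):
    """Average the predictive distributions of M independently trained
    models, given a list of (N, K) logit arrays, one per member."""
    probs = np.stack([softmax(l) for l in member_logits])  # shape (M, N, K)
    return probs.mean(axis=0)
```

When members disagree confidently, the averaged distribution is close to uniform, which is exactly the increased uncertainty the method is meant to surface.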
Calibration of Neural Networks using Splines
This work introduces a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test in which the main idea is to compare the respective cumulative probability distributions.
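The KS idea above can be sketched as the maximum gap between the cumulative confidence and cumulative accuracy curves after sorting by confidence. This is a sketch of that idea in NumPy, not the paper's spline-based implementation:

```python
import numpy as np

def ks_calibration_error(confidences, correct):
    """Binning-free calibration measure in the spirit of the KS test:
    the largest absolute gap between the running sums of confidence
    and correctness, sorted by confidence, normalized by sample count."""
    conf = np.asarray(confidences, dtype=float)
    acc = np.asarray(correct, dtype=float)
    order = np.argsort(conf)
    gap = np.cumsum(conf[order] - acc[order])
    return np.max(np.abs(gap)) / len(conf)
```

Unlike binned ECE, this requires no choice of bin count, which is the motivation the paper gives for the KS-style measure.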
Temporal Probability Calibration
This paper considers calibrating models that produce class probability estimates from sequences of data, focusing on the case where predictions are obtained from incomplete sequences, and shows that traditional calibration techniques are not sufficiently expressive for this task.
Measuring Calibration in Deep Learning
A comprehensive empirical study of choices in calibration measures, including measuring all probabilities rather than just the maximum prediction, thresholding probability values, class conditionality, number of bins, bins that are adaptive to the datapoint density, and the norm used to compare accuracies to confidences.
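Two of the choices this study examines, bin count and adaptive (equal-mass) versus equal-width bins, can be made concrete with a small ECE sketch. This is an illustrative NumPy implementation, not the paper's code:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10, adaptive=False):
    """ECE: bin predictions by confidence, then take the sample-weighted
    average of |mean accuracy - mean confidence| per bin. With
    adaptive=True, bin edges are confidence quantiles (equal-mass bins)."""
    conf = np.asarray(confidences, dtype=float)
    acc = np.asarray(correct, dtype=float)
    if adaptive:
        edges = np.quantile(conf, np.linspace(0.0, 1.0, n_bins + 1))
    else:
        edges = np.linspace(0.0, 1.0, n_bins + 1)
    bins = np.digitize(conf, edges[1:-1])  # bin index in [0, n_bins)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(acc[mask].mean() - conf[mask].mean())
    return ece
```

Because the reported number depends on these choices, the same model can appear more or less calibrated under different settings, which is the study's central point.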
Uncertainty Quantification and Deep Ensembles
It is demonstrated that, although standard ensembling techniques certainly help to boost accuracy, the calibration of deep ensembles relies on subtle trade-offs and, crucially, needs to be executed after the averaging process.
Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
A large-scale benchmark of existing state-of-the-art methods on classification problems and the effect of dataset shift on accuracy and calibration is presented, finding that traditional post-hoc calibration does indeed fall short, as do several other previous methods.
On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks
DNNs trained with mixup are significantly better calibrated and are less prone to over-confident predictions on out-of-distribution and random-noise data, suggesting that mixup be employed for classification tasks where predictive uncertainty is a significant concern.
Predictive Uncertainty Estimation via Prior Networks
This work proposes a new framework for modeling predictive uncertainty called Prior Networks (PNs) which explicitly models distributional uncertainty by parameterizing a prior distribution over predictive distributions and evaluates PNs on the tasks of identifying out-of-distribution samples and detecting misclassification on the MNIST dataset, where they are found to outperform previous methods.
Obtaining Well Calibrated Probabilities Using Bayesian Binning
A new non-parametric calibration method called Bayesian Binning into Quantiles (BBQ) is presented which addresses key limitations of existing calibration methods and can be readily combined with many existing classification algorithms.