# Attended Temperature Scaling: A Practical Approach for Calibrating Deep Neural Networks

@article{Mozafari2018AttendedTS, title={Attended Temperature Scaling: A Practical Approach for Calibrating Deep Neural Networks}, author={Azadeh Sadat Mozafari and Hugo Siqueira Gomes and Wilson Le{\~a}o and Steeven Janny and Christian Gagn{\'e}}, journal={arXiv: Learning}, year={2018} }

Recently, Deep Neural Networks (DNNs) have been achieving impressive results on a wide range of tasks. However, they suffer from poor calibration. In decision-making applications such as autonomous driving or medical diagnosis, the confidence of deep networks plays an important role in bringing trust and reliability to the system. To calibrate deep networks' confidence, many probabilistic and measure-based approaches have been proposed. Temperature Scaling (TS) is a state-of-the-art among…
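For context, temperature scaling rescales a trained network's logits by a single scalar T (typically fitted on held-out validation data) before the softmax. A minimal NumPy sketch of the mechanism — function names are illustrative, not from the paper:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def temperature_scale(logits, T):
    """Divide logits by a scalar temperature T before the softmax.

    T > 1 softens overconfident predictions; T = 1 leaves them
    unchanged. The arg-max (and thus accuracy) is never affected.
    """
    return softmax(np.asarray(logits, dtype=float) / T)

# An overconfident prediction softened by T = 2
logits = [4.0, 1.0, 0.5]
p1 = temperature_scale(logits, 1.0)   # original confidence
p2 = temperature_scale(logits, 2.0)   # calibrated (softer) confidence
```

Because T is fitted after training and only rescales logits, the method changes confidence estimates without changing which class is predicted.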

## 6 Citations

Class-Distribution-Aware Calibration for Long-Tailed Visual Recognition

- Computer Science, ArXiv
- 2021

This study proposes class-distribution-aware TS (CDA-TS) and LS (CDA-LS), which incorporate class frequency information into model calibration to accommodate the imbalanced data distributions of long-tailed recognition, yielding superior performance in both calibration error and predictive accuracy.

Confidence Calibration for Deep Renal Biopsy Immunofluorescence Image Classification

- Computer Science, 2020 25th International Conference on Pattern Recognition (ICPR)
- 2021

It is demonstrated that Temperature Scaling (TS), a recently introduced re-calibration technique, can be successfully applied to immunofluorescence classification in renal biopsy, and that TS is able to provide reliable probabilities, which are highly valuable for such a task given the low inter-rater agreement.

Local Temperature Scaling for Probability Calibration

- Computer Science
- 2020

This work proposes a learning-based calibration method that focuses on multi-label semantic segmentation, and adopts a tree-like convolution neural network to predict local temperature values for probability calibration.

Comparison of Recursive Neural Network and Markov Chain Models in Facies Inversion

- Computer Science, Mathematical Geosciences
- 2021

An innovative approach integrating recursive neural networks with the state-of-the-art seismic-to-facies inversion method known as the convolutional hidden Markov model is proposed, in order to predict geologically more realistic facies sequences from seismic data.

Criticality: A New Concept of Severity of Illness for Hospitalized Children.

- Medicine, Pediatric Critical Care Medicine: A Journal of the Society of Critical Care Medicine and the World Federation of Pediatric Intensive and Critical Care Societies
- 2020

The Criticality Index is a quantification of severity of illness for hospitalized children using physiology, therapy, and care intensity and is applicable to clinical investigations and predicting future care needs.

Multi-Loss Sub-Ensembles for Accurate Classification with Uncertainty Estimation

- Computer Science, Mathematics, ArXiv
- 2020

This work proposes an efficient method for uncertainty estimation in DNNs achieving high accuracy, and simulates the notion of multi-task learning on single-task problems by producing parallel predictions from similar models differing by their loss.

## References

Showing 1–10 of 51 references

A Scalable Laplace Approximation for Neural Networks

- Computer Science, ICLR
- 2018

This work uses recent insights from second-order optimisation for neural networks to construct a Kronecker factored Laplace approximation to the posterior over the weights of a trained network, enabling practitioners to estimate the uncertainty of models currently used in production without having to retrain them.

Trainable Calibration Measures For Neural Networks From Kernel Mean Embeddings

- Computer Science, ICML
- 2018

MMCE is presented, an RKHS-kernel-based measure of calibration that is efficiently trainable alongside the negative log-likelihood loss without careful hyperparameter tuning, and whose finite-sample estimates are consistent and enjoy fast convergence rates.

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

- Mathematics, Computer Science, NIPS
- 2017

This work proposes an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates.
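The core of the deep-ensembles approach is simple: average the softmax outputs of several independently trained networks. A hedged NumPy sketch of that averaging step (names and toy logits are illustrative):

```python
import numpy as np

def ensemble_predict(all_logits):
    """Average the softmax outputs of M independently trained models.

    The averaged distribution tends to be better calibrated than any
    single member, and disagreement between members can serve as an
    uncertainty signal.
    """
    z = np.asarray(all_logits, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)   # stable softmax per model
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return probs.mean(axis=0)               # average over the M models

# Three hypothetical ensemble members scoring one 3-class example
member_logits = [[2.0, 1.0, 0.0],
                 [0.5, 2.5, 0.0],
                 [1.5, 1.5, 0.0]]
p = ensemble_predict(member_logits)
```

Each member is trained from a different random initialization; no Bayesian machinery is required, which is what makes the method "simple and scalable".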

Wide Residual Networks

- Computer Science, BMVC
- 2016

This paper conducts a detailed experimental study on the architecture of ResNet blocks and proposes a novel architecture in which the depth of residual networks is decreased and their width increased; the resulting structures, called wide residual networks (WRNs), are far superior to their commonly used thin and very deep counterparts.

On Calibration of Modern Neural Networks

- Computer Science, Mathematics, ICML
- 2017

It is discovered that modern neural networks, unlike those from a decade ago, are poorly calibrated, and on most datasets, temperature scaling -- a single-parameter variant of Platt Scaling -- is surprisingly effective at calibrating predictions.
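The miscalibration this paper measures is commonly quantified with the Expected Calibration Error (ECE): predictions are binned by confidence, and the gaps between mean confidence and empirical accuracy are averaged with bin-size weights. A minimal sketch of the standard binning estimator (names are illustrative):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin predictions by confidence into (lo, hi] intervals and
    average |mean confidence - accuracy| weighted by bin population."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap        # weight by fraction in bin
    return ece

# Toy example: 4/5 correct at 0.9 confidence -> gap of 0.1 in one bin
conf = np.array([0.9, 0.9, 0.9, 0.9, 0.9])
corr = np.array([1, 1, 1, 1, 0])
ece = expected_calibration_error(conf, corr)   # ≈ 0.1
```

A perfectly calibrated model would have confidence matching accuracy in every bin, giving ECE near zero.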

Overcoming catastrophic forgetting in neural networks

- Medicine, Computer Science, Proceedings of the National Academy of Sciences
- 2017

It is shown that it is possible to overcome this limitation of connectionist models and train networks that maintain expertise on tasks they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

- Mathematics, Computer Science, ICML
- 2016

A new theoretical framework is developed casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, which mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy.
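In practice, MC dropout means keeping dropout active at test time and averaging many stochastic forward passes; the spread across passes serves as an uncertainty estimate. A hedged sketch for a single linear-softmax layer (weights and names are toy assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_predict(x, W, b, n_samples=100, p_drop=0.5):
    """Monte Carlo dropout: keep dropout ACTIVE at test time and
    average n_samples stochastic forward passes of a linear-softmax
    layer. Returns the predictive mean and per-class std (spread)."""
    probs = []
    for _ in range(n_samples):
        mask = rng.random(x.shape) >= p_drop        # Bernoulli dropout mask
        z = (x * mask / (1.0 - p_drop)) @ W + b     # inverted-dropout scaling
        z = z - z.max()                             # stable softmax
        p = np.exp(z) / np.exp(z).sum()
        probs.append(p)
    probs = np.stack(probs)
    return probs.mean(axis=0), probs.std(axis=0)

x = np.array([1.0, 2.0, 3.0])
W = rng.normal(size=(3, 2))                         # toy trained weights
b = np.zeros(2)
mean, std = mc_dropout_predict(x, W, b)
```

Each forward pass corresponds to a sample from the approximate posterior, so uncertainty comes essentially for free from a dropout-trained network.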

Bayesian dark knowledge

- Computer Science, NIPS
- 2015

This work describes a method for "distilling" a Monte Carlo approximation to the posterior predictive density into a more compact form, namely a single deep neural network.

Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks

- Computer Science, Mathematics, ICLR
- 2018

The proposed ODIN method is based on the observation that temperature scaling and small input perturbations can separate the softmax score distributions of in-distribution and out-of-distribution images, enabling more effective detection; it consistently outperforms the baseline approach by a large margin.
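The scoring side of ODIN can be sketched as the maximum softmax probability after strong temperature scaling; inputs whose score falls below a validated threshold are flagged as out-of-distribution. A minimal sketch (the paper's input-perturbation step, which requires input gradients, is omitted here; names and values are illustrative):

```python
import numpy as np

def odin_score(logits, T=1000.0):
    """ODIN-style confidence score: maximum softmax probability after
    dividing the logits by a large temperature T."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()                     # stable softmax
    p = np.exp(z) / np.exp(z).sum()
    return p.max()

def is_in_distribution(logits, threshold):
    """Flag an input as in-distribution when its score exceeds a
    threshold chosen on held-out data."""
    return odin_score(logits) > threshold

# A confident (in-distribution-like) input scores above a flat one
s_in = odin_score([10.0, 0.0, 0.0])
s_out = odin_score([0.0, 0.0, 0.0])
```

The large temperature compresses score differences among easy in-distribution inputs while leaving out-of-distribution inputs near the uniform baseline, which is what makes the two distributions separable by a threshold.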