# Meta-Calibration: Meta-Learning of Model Calibration Using Differentiable Expected Calibration Error

@article{Bohdal2021MetaCalibrationMO, title={Meta-Calibration: Meta-Learning of Model Calibration Using Differentiable Expected Calibration Error}, author={Ondrej Bohdal and Yongxin Yang and Timothy M. Hospedales}, journal={ArXiv}, year={2021}, volume={abs/2106.09613} }

Calibration of neural networks is a topical problem that is becoming increasingly important for real-world deployment of neural networks. The problem is especially noticeable with modern neural networks, for which there is a significant gap between the confidence a model reports and the confidence it should have. Various strategies have been proposed with success, yet there is still room for improvement. We propose a novel approach that introduces a differentiable metric for expected calibration…
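The abstract is truncated, so the paper's exact formulation is not shown here. A common way to make expected calibration error (ECE) differentiable — and a plausible reading of the title — is to replace hard bin assignments with soft memberships, so gradients can flow through the confidences. The sketch below is purely illustrative (the function name, bin centers, and Gaussian-style softening are assumptions, not the paper's method):

```python
import math

def soft_ece(confidences, correct, n_bins=5, temperature=0.01):
    """Illustrative soft-binned ECE: each sample gets a soft membership
    over bin centers instead of a hard bin index, so the measure is a
    smooth function of the confidences."""
    centers = [(b + 0.5) / n_bins for b in range(n_bins)]
    n = len(confidences)
    # Per-sample soft bin memberships (each row sums to 1).
    member = []
    for c in confidences:
        w = [math.exp(-((c - m) ** 2) / temperature) for m in centers]
        s = sum(w)
        member.append([wi / s for wi in w])
    ece = 0.0
    for b in range(n_bins):
        mass = sum(member[i][b] for i in range(n))
        if mass < 1e-12:
            continue
        avg_conf = sum(member[i][b] * confidences[i] for i in range(n)) / mass
        avg_acc = sum(member[i][b] * correct[i] for i in range(n)) / mass
        # Weighted |confidence - accuracy| gap, as in the binned ECE.
        ece += (mass / n) * abs(avg_conf - avg_acc)
    return ece
```

As `temperature` shrinks, the soft memberships approach hard binning and the value approaches the standard ECE.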

#### References

Showing 1–10 of 25 references

On Calibration of Modern Neural Networks

- Computer Science, Mathematics
- ICML
- 2017

It is discovered that modern neural networks, unlike those from a decade ago, are poorly calibrated, and on most datasets, temperature scaling -- a single-parameter variant of Platt Scaling -- is surprisingly effective at calibrating predictions.
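Temperature scaling, as summarized above, divides the logits by a single scalar T fitted on a validation set; since T > 0 preserves the argmax, accuracy is unchanged. A minimal sketch (the grid-search fitting is a simplification — the paper fits T by optimizing NLL with a gradient method):

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: T > 1 flattens, T < 1 sharpens.
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def fit_temperature(val_logits, val_labels, grid=None):
    """Pick the T that minimizes validation NLL (grid search here,
    purely for illustration)."""
    grid = grid or [0.5 + 0.1 * i for i in range(46)]  # T in [0.5, 5.0]
    def nll(T):
        return -sum(math.log(softmax(z, T)[y])
                    for z, y in zip(val_logits, val_labels))
    return min(grid, key=nll)
```

Because a single parameter is tuned on held-out data, the method is cheap and hard to overfit, which is why it is such a strong baseline.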

Trainable Calibration Measures For Neural Networks From Kernel Mean Embeddings

- Computer Science
- ICML
- 2018

MMCE is presented, an RKHS-kernel-based measure of calibration that is efficiently trainable alongside the negative log-likelihood loss without careful hyperparameter tuning, and whose finite-sample estimates are consistent and enjoy fast convergence rates.
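The MMCE plug-in estimator pairs each sample's calibration residual (correctness minus confidence) with every other sample's, weighted by a kernel over the confidences. The sketch below assumes the Laplacian kernel with bandwidth 0.4; treat both as illustrative choices rather than a faithful reimplementation:

```python
import math

def mmce(confidences, correct, width=0.4):
    """Illustrative plug-in MMCE estimate over one batch.

    confidences: predicted top-label probabilities r_i in [0, 1]
    correct: 0/1 indicators of whether the prediction was right
    """
    k = lambda a, b: math.exp(-abs(a - b) / width)  # assumed Laplacian kernel
    m = len(confidences)
    s = 0.0
    for ri, ci in zip(confidences, correct):
        for rj, cj in zip(confidences, correct):
            # Kernel-weighted product of calibration residuals.
            s += (ci - ri) * (cj - rj) * k(ri, rj)
    return math.sqrt(max(s, 0.0) / (m * m))
```

Unlike binned ECE, this double sum is a smooth function of the confidences, which is what makes it usable as an auxiliary training loss.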

Measuring Calibration in Deep Learning

- Computer Science, Mathematics
- CVPR Workshops
- 2019

A comprehensive empirical study of choices in calibration measures, including measuring all probabilities rather than just the maximum prediction, thresholding probability values, class conditionality, the number of bins, bins that adapt to the datapoint density, and the norm used to compare accuracies to confidences.

Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters

- Computer Science, Mathematics
- ICML
- 2016

The approach for tuning regularization hyperparameters is explored, and it is found that in experiments on MNIST, SVHN and CIFAR-10 the resulting regularization levels fall within the optimal regions.

Optimizing Millions of Hyperparameters by Implicit Differentiation

- Computer Science, Mathematics
- AISTATS
- 2020

An algorithm for inexpensive gradient-based hyperparameter optimization is proposed that combines the implicit function theorem (IFT) with efficient inverse-Hessian approximations, and it is used to train modern network architectures with millions of weights and millions of hyperparameters.

MetaReg: Towards Domain Generalization using Meta-Regularization

- Computer Science
- NeurIPS
- 2018

Experimental validations on computer vision and natural language datasets indicate that encoding the notion of domain generalization as a novel regularization function, within a learning-to-learn (meta-learning) framework, can learn regularizers that achieve good cross-domain generalization.

Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration

- Computer Science, Mathematics
- NeurIPS
- 2019

A natively multiclass calibration method is proposed that is applicable to classifiers from any model class, derived from Dirichlet distributions and generalising the beta calibration method from binary classification.
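Dirichlet calibration amounts to an affine map applied to the log-probabilities followed by a softmax, which strictly generalizes temperature scaling (recovered when the matrix is a scaled identity). A minimal sketch, with the parameters `W` and `b` assumed to have been fitted on validation data:

```python
import math

def dirichlet_calibrate(probs, W, b):
    """Map class probabilities through a learned affine transform of
    their logs, then renormalize with a softmax."""
    logp = [math.log(max(p, 1e-12)) for p in probs]  # clamp to avoid log(0)
    k = len(W)
    z = [sum(W[i][j] * logp[j] for j in range(len(logp))) + b[i]
         for i in range(k)]
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]
```

With `W` the identity and `b` zero, the map is a no-op, so the fitted parameters can only improve validation NLL relative to the uncalibrated model.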

Obtaining Well Calibrated Probabilities Using Bayesian Binning

- Computer Science, Medicine
- AAAI
- 2015

A new non-parametric calibration method called Bayesian Binning into Quantiles (BBQ) is presented which addresses key limitations of existing calibration methods and can be readily combined with many existing classification algorithms.

Adam: A Method for Stochastic Optimization

- Computer Science, Mathematics
- ICLR
- 2015

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
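The Adam update keeps exponential moving averages of the gradient and its square, bias-corrects both, and scales the step per-parameter. A self-contained sketch of one update step (defaults follow the paper; the flat-list parameter layout is a simplification):

```python
import math

def adam_step(params, grads, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update over a flat list of scalar parameters.
    `state` carries the moment estimates m, v and the step counter t."""
    m, v, t = state["m"], state["v"], state["t"] + 1
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        m[i] = b1 * m[i] + (1 - b1) * g       # first-moment estimate
        v[i] = b2 * v[i] + (1 - b2) * g * g   # second-moment estimate
        m_hat = m[i] / (1 - b1 ** t)          # bias correction
        v_hat = v[i] / (1 - b2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    state["t"] = t
    return new_params
```

The per-parameter division by `sqrt(v_hat)` is what makes the effective step size roughly invariant to the gradient's scale.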

When Does Label Smoothing Help?

- Computer Science, Mathematics
- NeurIPS
- 2019

It is shown empirically that, in addition to improving generalization, label smoothing improves model calibration, which can significantly improve beam search, and that if a teacher network is trained with label smoothing, knowledge distillation into a student network is much less effective.
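The label-smoothing target itself is a one-liner: mix the one-hot vector with the uniform distribution over the K classes. A sketch of the standard formulation (the function name is ours):

```python
def smooth_labels(one_hot, alpha=0.1):
    """Standard label smoothing: (1 - alpha) * one_hot + alpha / K."""
    k = len(one_hot)
    return [(1 - alpha) * y + alpha / k for y in one_hot]
```

Training against these softened targets discourages the logit of the true class from growing without bound, which is the mechanism behind the calibration improvement (and the distillation interference) described above.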