Interpretable Uncertainty Quantification in AI for HEP

@article{Chen2022InterpretableUQ,
  title={Interpretable Uncertainty Quantification in AI for HEP},
  author={Thomas Y. Chen and Biprateep Dey and Aishik Ghosh and Michael Kagan and Brian Nord and Nesar S. Ramachandra},
  journal={ArXiv},
  year={2022},
  volume={abs/2208.03284}
}
Estimating uncertainty is at the core of performing scientific measurements in high-energy physics (HEP): a measurement is not useful without an estimate of its uncertainty. The goal of uncertainty quantification (UQ) is inextricably linked to the question, “how do we physically and statistically interpret these uncertainties?” The answer to this question depends not only on the computational task we aim to undertake, but also on the methods we use for that task. For artificial intelligence (AI) applications in HEP…

References

Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods

This paper provides an introduction to the topic of uncertainty in machine learning, an overview of attempts so far at handling uncertainty in general, and a formalization of the distinction between aleatoric and epistemic uncertainty in particular.

Diagnostics for conditional density models and Bayesian inference algorithms

This paper presents rigorous and easy-to-interpret diagnostics such as the “Local Coverage Test” (LCT), which distinguishes an arbitrarily misspecified model from the true conditional density of the sample, and “Amortized Local P-P plots” (ALP), which can quickly provide interpretable graphical summaries of distributional differences at any location x in the feature space.
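As a rough illustration of the general idea (a sketch, not the paper's exact procedure), the snippet below computes probability integral transform (PIT) values under a hypothetical conditional-CDF interface `model_cdf(y, x)` and regresses a coverage indicator on the features; deviations of the fitted local coverage from the nominal level flag regions where the model is miscalibrated. The paper employs more flexible regressions and formal test statistics than the plain logistic regression used here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def local_coverage_diagnostic(model_cdf, X_val, y_val, alpha=0.5):
    """Estimate local coverage P(PIT <= alpha | x) across feature space.

    For a perfectly calibrated conditional density model, the PIT values
    F(y | x) are Uniform(0, 1) at every x, so this probability should be
    close to `alpha` everywhere.  `model_cdf(y, x)` is a hypothetical
    interface returning the model's conditional CDF at y given features x;
    X_val is a 2-D feature array, y_val the held-out targets.
    """
    # Probability integral transform of held-out targets under the model.
    pit = np.array([model_cdf(y, x) for x, y in zip(X_val, y_val)])
    # Regress the coverage indicator on the features; the fitted
    # probabilities estimate local coverage at each x.
    clf = LogisticRegression().fit(X_val, (pit <= alpha).astype(int))
    local_cov = clf.predict_proba(X_val)[:, 1]
    # Deviations from alpha indicate local miscalibration.
    return local_cov - alpha
```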

Aleatory or epistemic? Does it matter?

A cautionary tale of decorrelating theory uncertainties

This work provides examples of two-point (fragmentation modeling) and continuous (higher-order corrections) uncertainties for which decorrelation significantly reduces the apparent uncertainty while the true uncertainty is much larger.
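To see the mechanism at work, consider a toy numeric illustration (ours, not the paper's physics examples): a single fully correlated shift of size delta, artificially split into n "independent" nuisance parameters and summed in quadrature, shrinks to delta / sqrt(n).

```python
import numpy as np

# Toy illustration: one two-point theory uncertainty of size delta,
# artificially decorrelated into n "independent" pieces of size delta / n.
delta, n = 1.0, 10
correlated_total = delta                          # one fully correlated shift
decorrelated_total = np.sqrt(n * (delta / n)**2)  # quadrature sum of pieces

print(f"fully correlated : {correlated_total:.3f}")   # 1.000
print(f"decorrelated     : {decorrelated_total:.3f}") # 0.316 = delta / sqrt(n)
```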

Calibrated Predictive Distributions via Diagnostics for Conditional Coverage

This work shows that recalibration, as well as diagnostics of entire PDs, are indeed attainable goals in practice and produces calibrated PDs for two applications: probabilistic nowcasting based on sequences of satellite images, and estimation of galaxy distances based on imaging data (photometric redshifts).
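A minimal sketch of the marginal version of this idea, assuming the same hypothetical `model_cdf(y, x)` interface as above: map raw CDF values through the empirical CDF of held-out PIT values. The paper's method goes further by making the correction conditional on x; this global variant is only meant to convey the mechanism.

```python
import numpy as np

def recalibrate_cdf(model_cdf, X_val, y_val):
    """Marginal recalibration of a predictive CDF via PIT values.

    Simplified sketch: the paper makes the correction conditional on x;
    here a single global correction is applied.
    """
    pit = np.sort([model_cdf(y, x) for x, y in zip(X_val, y_val)])

    def calibrated_cdf(y, x):
        # Map the raw CDF value through the empirical CDF of the
        # validation PITs; if the model were already calibrated, the
        # PITs would be uniform and this map would be the identity.
        return np.searchsorted(pit, model_cdf(y, x)) / len(pit)

    return calibrated_cdf
```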

Evaluation of Uncertainty Quantification in Deep Learning

This work evaluates four different methods that have been proposed to correctly quantify uncertainty when an AI model is faced with new samples, finding that they capture uncertainty differently and that the correlation between the models' quantified uncertainties is low.

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

This work proposes an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates.
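The recipe is simple enough to sketch. Below is a minimal PyTorch version under our own assumptions (the architecture, optimizer settings, and omission of the paper's adversarial-training step are choices made here, not prescribed by the source): train several independently initialized networks on a Gaussian negative log-likelihood and combine them as a mixture.

```python
import torch
import torch.nn as nn

class MeanVarianceNet(nn.Module):
    """Small regressor predicting a Gaussian mean and log-variance."""
    def __init__(self, d_in):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(),
                                  nn.Linear(64, 2))
    def forward(self, x):
        mu, log_var = self.body(x).chunk(2, dim=-1)
        return mu, log_var

def train_ensemble(X, y, n_members=5, epochs=200):
    """Train independently initialized nets on the Gaussian NLL.

    X is an (N, d) float tensor, y an (N, 1) float tensor.
    """
    members = []
    for _ in range(n_members):
        net = MeanVarianceNet(X.shape[1])
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        for _ in range(epochs):
            mu, log_var = net(X)
            # Gaussian negative log-likelihood (up to constants).
            loss = (0.5 * (log_var + (y - mu) ** 2 / log_var.exp())).mean()
            opt.zero_grad(); loss.backward(); opt.step()
        members.append(net)
    return members

@torch.no_grad()
def predict(members, X):
    """Mixture mean and variance across ensemble members."""
    mus, vars_ = zip(*[(mu, lv.exp()) for mu, lv in (m(X) for m in members)])
    mus, vars_ = torch.stack(mus), torch.stack(vars_)
    # Total variance = mean aleatoric variance + spread of member means.
    return mus.mean(0), vars_.mean(0) + mus.var(0, unbiased=False)
```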

Mining gold from implicit models to improve likelihood-free inference

This work presents inference techniques for implicit (simulator-based) models that combine the insight that additional latent information can be extracted from the simulator with the power of neural networks in regression and density estimation tasks, leading to better sample efficiency and quality of inference.

Likelihood-Free Frequentist Inference: Bridging Classical Statistics and Machine Learning in Simulation and Uncertainty Quantification

This paper presents a statistical framework for likelihood-free inference (LFI) that unifies classical statistics with modern machine learning to construct frequentist confidence sets and hypothesis tests with finite-sample guarantees of nominal coverage, along with rigorous diagnostics for assessing empirical coverage over the entire parameter space.

Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures

This study provides the first large-scale evaluation of the empirical frequentist coverage properties of well-known uncertainty quantification techniques on a suite of regression and classification tasks, finding that some methods do achieve desirable coverage on in-distribution samples, but that coverage is not maintained on out-of-distribution data.
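Measuring empirical coverage itself takes only a few lines. The sketch below (function name and the Gaussian-interval usage are illustrative assumptions, not the study's code) computes the fraction of held-out targets falling inside nominal central prediction intervals.

```python
import numpy as np

def empirical_coverage(lower, upper, y_true):
    """Fraction of true values falling inside the predicted intervals.

    For nominal 90% intervals, a calibrated method should return ~0.90;
    the study finds this often fails on out-of-distribution data.
    """
    lower, upper, y_true = map(np.asarray, (lower, upper, y_true))
    return np.mean((y_true >= lower) & (y_true <= upper))

# Hypothetical usage: 90% central intervals from a Gaussian predictor
# (z = 1.645 leaves 5% in each tail).
mu, sigma, y = np.zeros(1000), np.ones(1000), np.random.randn(1000)
print(empirical_coverage(mu - 1.645 * sigma, mu + 1.645 * sigma, y))
```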
...