# Uncertainty Quantification in the Classification of High Dimensional Data

@article{Bertozzi2017UncertaintyQI, title={Uncertainty Quantification in the Classification of High Dimensional Data}, author={A. Bertozzi and Xiyang Luo and Andrew M. Stuart and Konstantinos C. Zygalakis}, journal={ArXiv}, year={2017}, volume={abs/1703.08816} }

Classification of high dimensional data finds wide-ranging applications. In many of these applications equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of, a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The…
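As a minimal sketch of how a posterior over labels can yield an uncertainty measure, the snippet below assumes that samples of a real-valued latent function (one value per data point) are already available from some sampler, and that labels are obtained by thresholding at zero. The function name `label_uncertainty` and the thresholding rule are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

def label_uncertainty(latent_samples):
    """Summarize posterior samples into predictions and confidence scores.

    latent_samples: array of shape (n_samples, n_points); each row is one
    posterior draw of a latent function (hypothetical input, assumed to
    come from any sampler).
    """
    # Assumption: the binary label of a point is the sign of the latent value.
    labels = np.where(latent_samples >= 0, 1.0, -1.0)
    # Posterior mean of the label lies in [-1, 1] for every point.
    mean_label = labels.mean(axis=0)
    prediction = np.where(mean_label >= 0, 1, -1)
    # |mean| near 1: draws agree (confident); near 0: draws disagree.
    confidence = np.abs(mean_label)
    return prediction, confidence
```

A point whose draws all share one sign gets confidence 1, while a point whose draws split evenly gets confidence 0.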

## 18 Citations

### Large Data and Zero Noise Limits of Graph-Based Semi-Supervised Learning Algorithms

- Computer Science
- Applied and Computational Harmonic Analysis
- 2020

### Analysis of $p$-Laplacian Regularization in Semi-Supervised Learning

- Computer Science, Mathematics
- SIAM J. Math. Anal.
- 2019

A new model is introduced that is as simple as the original model but overcomes this restriction, and it is proved that the minimizers of the discrete functionals in the random setting converge uniformly to the desired continuum limit.

### Uncertainty quantification for semi-supervised multi-class classification in image processing and ego-motion analysis of body-worn videos

- Computer Science
- Image Processing: Algorithms and Systems
- 2019

This paper introduces an uncertainty quantification (UQ) method for graph-based semi-supervised multi-class classification problems, which not only predicts the class label for each data point, but also provides a confidence score for the prediction.

### Gaussian Process Landmarking on Manifolds

- Computer Science
- SIAM J. Math. Data Sci.
- 2019

This work provides an asymptotic analysis for the decay of the maximum conditional variance, which is frequently employed as a greedy criterion for similar variance- or uncertainty-based sequential experimental design strategies, and is the first result of this type for experimental design.

### Continuum Limits of Posteriors in Graph Bayesian Inverse Problems

- Mathematics, Computer Science
- SIAM J. Math. Anal.
- 2018

A graph-based Bayesian inverse problem is introduced, and it is shown that the graph-posterior measures over functions in $M_n$ converge, in the large $n$ limit, to a posterior over functions in $M$ that solves a Bayesian inverse problem with known domain.

### The Bayesian Update: Variational Formulations and Gradient Flows

- Computer Science
- Bayesian Analysis
- 2020

It is shown that the rate of convergence of the flows to the posterior can be bounded by the geodesic convexity of the functional to be minimized, and this observation is used to propose a criterion for the choice of metric in Riemannian MCMC methods.

### Ensemble Kalman inversion: a derivative-free technique for machine learning tasks

- Computer Science
- Inverse Problems
- 2019

An efficient, gradient-free algorithm for finding a solution to classical inverse or filtering problems using ensemble Kalman inversion (EKI), which is inherently parallelizable and is applicable to problems with non-differentiable loss functions, for which back-propagation is not possible.

### Data Based Construction of Kernels for Classification

- Computer Science
- 2020

This paper constructs a data-dependent kernel utilizing the components of the eigen-decompositions of different kernels constructed using ideas from diffusion geometry, and uses a regularization technique with this kernel with adaptively chosen parameters.

### Nonparametric Bayesian label prediction on a large graph using truncated Laplacian regularization

- Mathematics, Computer Science
- Commun. Stat. Simul. Comput.
- 2021

An implementation of a nonparametric Bayesian approach to solving binary classification problems on graphs is described, with a prior constructed by truncating a series expansion of the soft label function using the graph Laplacian eigenfunctions as basis functions.

### Variational Limits of k-NN Graph-Based Functionals on Data Clouds

- Mathematics, Computer Science
- SIAM J. Math. Data Sci.
- 2019

It is rigorously shown that, provided the number of neighbors in the graph $k:=k_n$ scales with the number of points in the cloud as $n \gg k_n \gg \log(n)$, the solution to the graph cut optimization problem converges towards the solution of an analogous variational problem at the continuum level.

## References

SHOWING 1-10 OF 53 REFERENCES

### Hyperparameter and Kernel Learning for Graph Based Semi-Supervised Classification

- Computer Science
- NIPS
- 2005

This paper presents a Bayesian framework for learning hyperparameters for graph-based semi-supervised classification and shows that the posterior mean can be written in terms of the kernel matrix, providing a Bayesian classifier to classify new points.

### Regression and Classification Using Gaussian Process Priors

- Computer Science
- 2009

Gaussian processes are in my view the simplest and most obvious way of defining flexible Bayesian regression and classification models, but despite some past usage, they appear to have been rather neglected as a general-purpose technique.

### On the Brittleness of Bayesian Inference

- Computer Science
- SIAM Rev.
- 2015

It is reported that, although Bayesian methods are robust when the number of possible outcomes is finite or when only a finite number of marginals of the data-generating distribution are unknown, they could be generically brittle when applied to continuous systems with finite information.

### Diffuse Interface Models on Graphs for Classification of High Dimensional Data

- Computer Science
- SIAM Rev.
- 2016

This work develops a class of variational algorithms that combine recent ideas from spectral methods on graphs with nonlinear edge/region detection methods traditionally used in the PDE-based imaging community based on the Ginzburg-Landau functional.

### Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples

- Computer Science
- J. Mach. Learn. Res.
- 2006

A semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner is proposed, and properties of reproducing kernel Hilbert spaces are used to prove new Representer theorems that provide a theoretical basis for the algorithms.

### Multiclass Data Segmentation Using Diffuse Interface Methods on Graphs

- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2014

Two graph-based algorithms are presented for multiclass segmentation of high-dimensional data, using a graph adaptation of the classical numerical Merriman-Bence-Osher (MBO) scheme, which alternates between diffusion and thresholding.
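The alternation between diffusion and thresholding can be illustrated with a small sketch. It is a simplified binary (two-class) variant rather than the multiclass algorithms of the paper, and it assumes an unnormalized graph Laplacian, one implicit Euler step per diffusion stage, and a fidelity term on labeled nodes; the function name `graph_mbo` and the parameters `dt` and `mu` are hypothetical.

```python
import numpy as np

def graph_mbo(W, fidelity, n_iter=20, dt=0.5, mu=10.0):
    """Binary MBO-style iteration on a graph (illustrative sketch only).

    W: symmetric (n, n) array of nonnegative edge weights.
    fidelity: length-n array, +1 or -1 on labeled nodes and 0 elsewhere.
    """
    n = W.shape[0]
    # Unnormalized graph Laplacian L = D - W (an assumption of this sketch).
    L = np.diag(W.sum(axis=1)) - W
    labeled = (fidelity != 0).astype(float)
    # Implicit Euler step for du/dt = -L u - mu * labeled * (u - fidelity).
    A = np.eye(n) + dt * L + dt * mu * np.diag(labeled)
    u = fidelity.astype(float).copy()
    for _ in range(n_iter):
        # Diffusion stage: one linear solve with the fixed matrix A.
        u = np.linalg.solve(A, u + dt * mu * fidelity)
        # Thresholding stage: project back onto the labels {-1, +1}.
        u = np.where(u >= 0, 1.0, -1.0)
    return u
```

On a graph made of two tightly connected clusters joined by a weak edge, with one labeled node per cluster, the iteration assigns each cluster the label of its labeled node.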

### Bayesian image classification using Markov random fields

- Computer Science
- Image Vis. Comput.
- 1996

### Optimal Uncertainty Quantification

- Computer Science
- SIAM Rev.
- 2013

A general algorithmic framework is developed for OUQ and is tested on the Caltech surrogate model for hypervelocity impact and on the seismic safety assessment of truss structures, suggesting the feasibility of the framework for important complex systems.

### MCMC Methods for Functions: Modifying Old Algorithms to Make Them Faster

- Computer Science
- 2013

An approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement.
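One widely used sampler of this kind is the preconditioned Crank-Nicolson (pCN) method. The sketch below is a finite-dimensional illustration under stated assumptions: the prior is a centered Gaussian with covariance `C_sqrt @ C_sqrt.T`, and the target has density proportional to `exp(-neg_log_likelihood(u))` with respect to that prior. The function name `pcn` and all parameter names are hypothetical.

```python
import numpy as np

def pcn(neg_log_likelihood, C_sqrt, n_samples=5000, beta=0.2, seed=0):
    """Preconditioned Crank-Nicolson sampler (illustrative sketch).

    neg_log_likelihood: callable mapping a state vector to a float.
    C_sqrt: (d, d) square root of the Gaussian prior covariance.
    beta: step size in (0, 1].
    """
    rng = np.random.default_rng(seed)
    d = C_sqrt.shape[0]
    u = C_sqrt @ rng.standard_normal(d)  # start from a prior draw
    phi_u = neg_log_likelihood(u)
    samples = np.empty((n_samples, d))
    accepted = 0
    for k in range(n_samples):
        xi = C_sqrt @ rng.standard_normal(d)  # fresh prior draw
        # The proposal leaves the Gaussian prior invariant.
        v = np.sqrt(1.0 - beta**2) * u + beta * xi
        phi_v = neg_log_likelihood(v)
        # Acceptance depends only on the likelihood, not on the prior.
        if np.log(rng.uniform()) < phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        samples[k] = u
    return samples, accepted / n_samples
```

Because the prior does not enter the acceptance ratio, a constant likelihood leads to every proposal being accepted, which is one way to see why the acceptance rate does not degrade as the dimension `d` grows.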