# Contraction rates for sparse variational approximations in Gaussian process regression

@inproceedings{Nieman2021ContractionRF, title={Contraction rates for sparse variational approximations in Gaussian process regression}, author={Dennis Nieman and Botond Szab{\'o} and Harry van Zanten}, year={2021} }

We study the theoretical properties of a variational Bayes method in the Gaussian process regression model. We consider the inducing variables method introduced by Titsias (2009b) and derive sufficient conditions for obtaining contraction rates for the corresponding variational Bayes (VB) posterior. As examples we show that for three particular covariance kernels (Matérn, squared exponential, random series prior) the VB approach can achieve optimal, minimax contraction rates for a sufficiently…

## 7 Citations

### Posterior contraction for deep Gaussian process priors

- Computer Science, Mathematics
- 2021

It is shown that the contraction rates can achieve the minimax convergence rate (up to log n factors), while being adaptive to the underlying structure and smoothness of the target function.

### Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning

- Computer Science
- ICML
- 2022

Novel confidence intervals are provided for the Nyström method and the sparse variational Gaussian process approximation method, which are established using novel interpretations of the approximate (surrogate) posterior variance of the models.

### Optimal recovery and uncertainty quantification for distributed Gaussian process regression

- Computer Science
- 2022

This work derives frequentist theoretical guarantees and limitations for a range of distributed methods for general GP priors in the context of the nonparametric regression model, both for recovery and uncertainty quantification.

### Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees

- Computer Science
- ArXiv
- 2022

This work studies the numerical stability of scalable sparse approximations based on inducing points and proposes an automated method for computing inducing points satisfying these conditions, showing that, in geospatial settings, sparse approximations with guaranteed numerical stability often perform comparably to those without.

### Ultra-fast Deep Mixtures of Gaussian Process Experts

- Computer Science
- ArXiv
- 2020

This article proposes to design the gating network for selecting the experts from such mixtures of sparse GPs using a deep neural network (DNN) which provides a flexible, robust, and efficient model which is able to significantly outperform competing models.

### Fast Deep Mixtures of Gaussian Process Experts

- Computer Science
- 2020

This article proposes to design the gating network for selecting the experts from such mixtures of sparse GPs using a deep neural network (DNN); a fast one-pass algorithm called Cluster-Classify-Regress (CCR) is leveraged to approximate the maximum a posteriori (MAP) estimator extremely quickly.

### Scalable Variational Bayes methods for Hawkes processes

- Computer Science
- 2022

Multivariate Hawkes processes are temporal point processes extensively applied to model event data with dependence on past occurrences and interaction phenomena, e.g., neuronal spike trains, online…

## References

### Variational Model Selection for Sparse Gaussian Process Regression

- Computer Science
- 2008

A variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood.

### Variational Learning of Inducing Variables in Sparse Gaussian Processes

- Computer Science
- AISTATS
- 2009

A variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood.
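
For orientation, the lower bound mentioned in this summary can be written out in its collapsed form for Gaussian-noise regression. The notation is assumed here rather than taken from this listing: $y \in \mathbb{R}^n$ are the observations, $\sigma^2$ is the noise variance, $K_{nn}$ is the kernel matrix at the $n$ training inputs, $K_{mm}$ is the kernel matrix at the $m$ inducing inputs, and $K_{nm} = K_{mn}^\top$ is the cross-covariance between the two.

```latex
\[
  \log p(y) \;\ge\;
  \log \mathcal{N}\!\left(y \mid 0,\; Q_{nn} + \sigma^2 I_n\right)
  \;-\; \frac{1}{2\sigma^2}\,\operatorname{tr}\!\left(K_{nn} - Q_{nn}\right),
  \qquad
  Q_{nn} = K_{nm} K_{mm}^{-1} K_{mn}.
\]
```

The trace term penalizes inducing inputs that leave prior variance at the training inputs unexplained. When $Q_{nn} = K_{nn}$, for instance when the inducing inputs coincide with the training inputs, the trace vanishes and the bound equals the exact log marginal likelihood.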

### Rates of contraction of posterior distributions based on Gaussian process priors

- Mathematics, Computer Science
- 2008

The rate of contraction of the posterior distribution based on sampling from a smooth density model when the prior models the log density as a (fractionally integrated) Brownian motion is shown to depend on the position of the true parameter relative to the reproducing kernel Hilbert space of the Gaussian process.

### Variational Bayes for High-Dimensional Linear Regression With Sparse Priors

- Computer Science
- 2019

A mean-field spike and slab variational Bayes (VB) approximation to Bayesian model selection priors in sparse high-dimensional linear regression is studied, showing that it performs comparably to other state-of-the-art Bayesian variable selection methods.

### Variational Fourier Features for Gaussian Processes

- Computer Science
- J. Mach. Learn. Res.
- 2017

This work hinges on a key result that there exist spectral features related to a finite domain of the Gaussian process which exhibit almost-independent covariances, derives these expressions for Matérn kernels in one dimension, and generalizes to more dimensions using kernels with specific structures.

### Information Rates of Nonparametric Gaussian Process Methods

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2011

The results show that for good performance, the regularity of the GP prior should match the regularity of the unknown response function, and that the performance is expressible in terms of a certain concentration function.

### On Sparse Variational Methods and the Kullback-Leibler Divergence between Stochastic Processes

- Computer Science
- AISTATS
- 2016

A substantial generalization of the literature on the variational framework for learning inducing variables is given, along with a new proof of the result for infinite index sets which allows inducing points that are not data points and likelihoods that depend on all function values.

### Convergence of Sparse Variational Inference in Gaussian Processes Regression

- Computer Science
- J. Mach. Learn. Res.
- 2020

It is shown that the KL-divergence between the approximate model and the exact posterior can be made arbitrarily small for a Gaussian-noise regression model, and bounds are characterized on how M needs to grow with N to ensure high quality approximations.

### Rates of Convergence for Sparse Variational Gaussian Process Regression

- Computer Science
- ICML
- 2019

The results show that as datasets grow, Gaussian process posteriors can truly be approximated cheaply, and provide a concrete rule for how to increase $M$ in continual learning scenarios.

### Reproducing kernel Hilbert spaces of Gaussian priors

- Mathematics, Computer Science
- 2008

We review definitions and properties of reproducing kernel Hilbert spaces attached to Gaussian variables and processes, with a view to applications in nonparametric Bayesian statistics using Gaussian…