# Variational Learning of Inducing Variables in Sparse Gaussian Processes

@inproceedings{Titsias2009VariationalLO, title={Variational Learning of Inducing Variables in Sparse Gaussian Processes}, author={Michalis K. Titsias}, booktitle={International Conference on Artificial Intelligence and Statistics}, year={2009} }

Sparse Gaussian process methods that use inducing variables require the selection of the inducing inputs and the kernel hyperparameters. We introduce a variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood. The key property of this formulation is that the inducing inputs are defined to be variational parameters which are selected by minimizing the Kullback-Leibler…
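As a rough illustration of "maximizing a lower bound of the true log marginal likelihood", the sketch below evaluates the collapsed regression bound usually associated with this paper, $F = \log \mathcal{N}(y \mid 0, \sigma^2 I + Q_{nn}) - \frac{1}{2\sigma^2}\operatorname{tr}(K_{nn} - Q_{nn})$ with $Q_{nn} = K_{nm} K_{mm}^{-1} K_{mn}$. The truncated abstract above does not state this formula, so treat it as an assumption drawn from the standard presentation of the method. The squared-exponential kernel, the jitter term, and all function names are illustrative choices, and the code forms the full $n \times n$ matrix for readability rather than using the cheaper $O(nm^2)$ computation.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between the rows of A and B (assumed kernel)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def gaussian_log_density(y, cov):
    """log N(y | 0, cov), computed through a Cholesky factorization."""
    n = y.shape[0]
    L = np.linalg.cholesky(cov)
    alpha = np.linalg.solve(L, y)  # alpha @ alpha equals y^T cov^{-1} y
    log_det_half = np.sum(np.log(np.diag(L)))  # equals 0.5 * log|cov|
    return -0.5 * alpha @ alpha - log_det_half - 0.5 * n * np.log(2.0 * np.pi)


def exact_log_marginal(X, y, noise_var, lengthscale=1.0, variance=1.0):
    """Exact GP regression log marginal likelihood, for comparison."""
    K_nn = rbf_kernel(X, X, lengthscale, variance)
    return gaussian_log_density(y, K_nn + noise_var * np.eye(X.shape[0]))


def collapsed_bound(X, y, Z, noise_var, lengthscale=1.0, variance=1.0, jitter=1e-6):
    """Variational lower bound with inducing inputs Z (naive O(n^3) sketch)."""
    n, m = X.shape[0], Z.shape[0]
    K_mm = rbf_kernel(Z, Z, lengthscale, variance) + jitter * np.eye(m)
    K_mn = rbf_kernel(Z, X, lengthscale, variance)
    L_m = np.linalg.cholesky(K_mm)
    A = np.linalg.solve(L_m, K_mn)  # A^T A equals Q_nn
    Q_nn = A.T @ A
    # Data-fit term: Gaussian log density under the low-rank covariance.
    fit = gaussian_log_density(y, Q_nn + noise_var * np.eye(n))
    # Trace term: penalizes prior variance not explained by the inducing points.
    # For this kernel the diagonal of K_nn is constant and equal to `variance`.
    trace_penalty = (n * variance - np.trace(Q_nn)) / (2.0 * noise_var)
    return fit - trace_penalty
```

In this sketch, `Z`, `lengthscale`, `variance`, and `noise_var` would all be passed to a gradient-based optimizer that maximizes `collapsed_bound`. Because the result never exceeds the exact log marginal likelihood, moving the inducing inputs can only tighten the approximation rather than change the model being fitted.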

## 1,253 Citations

### Variational Model Selection for Sparse Gaussian Process Regression

- Computer Science
- 2008

A variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood.

### Sparse Orthogonal Variational Inference for Gaussian Processes

- Computer Science
- 2019

A new interpretation of sparse variational approximations for Gaussian processes using inducing points, which can lead to more scalable algorithms than previous methods; state-of-the-art results are reported on CIFAR-10 with purely GP-based models.

### Sparse Orthogonal Variational Inference for Gaussian Processes

- Computer Science
- AISTATS
- 2020

A new interpretation of sparse variational approximations for Gaussian processes using inducing points is introduced, which can lead to more scalable algorithms than previous methods; state-of-the-art results are reported on CIFAR-10 among purely GP-based models.

### Probabilistic selection of inducing points in sparse Gaussian processes

- Computer Science
- UAI
- 2021

This work places a point process prior on the inducing points, approximates the associated posterior through stochastic variational inference, and demonstrates how the method can be applied in deep Gaussian processes and latent variable modelling.

### Regularized Variational Sparse Gaussian Processes

- Computer Science
- 2017

This work regularizes the pseudo input estimation toward a statistical summarization of the training inputs in kernel space, and derives a tight variational lower bound, which introduces an additional regularization term of the pseudo inputs and kernel parameters.

### Sparse Gaussian Processes Revisited

- Computer Science
- 2021

This work develops a fully Bayesian approach to scalable GP and deep GP models, and demonstrates its state-of-the-art performance through an extensive experimental campaign across several regression and classification problems.

### Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations

- Computer Science
- AISTATS
- 2021

This work shows that, by revisiting old model approximations such as the fully-independent training conditionals endowed with powerful sampling-based inference methods, treating both inducing locations and GP hyper-parameters in a Bayesian way can improve performance significantly.

### Regularization of Sparse Gaussian Processes with Application to Latent Variable Models

- Computer Science
- 2021

This work extends this regularization approach into latent variable models with SGPs and shows that performing variational inference (VI) on those models is equivalent to performing VI on a related empirical Bayes model.

### Bayesian Gaussian Process Latent Variable Model

- Computer Science
- AISTATS
- 2010

A variational inference framework for training the Gaussian process latent variable model, and thus performing Bayesian nonlinear dimensionality reduction, is introduced; maximizing the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear latent space.

### Sparse within Sparse Gaussian Processes using Neighbor Information

- Computer Science
- ICML
- 2021

This work introduces a novel hierarchical prior, which imposes sparsity on the set of inducing variables and makes it possible to use sparse GPs with a large number of inducing points without incurring a prohibitive computational cost.

## References

### Variational Model Selection for Sparse Gaussian Process Regression

- Computer Science
- 2008

A variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood.

### Sparse Gaussian Processes using Pseudo-inputs

- Computer Science
- NIPS
- 2005

It is shown that this new Gaussian process (GP) regression model can match full GP performance with small M, i.e. very sparse solutions, and it significantly outperforms other approaches in this regime.

### Sparse Online Gaussian Processes

- Computer Science
- 2008

This work develops an approach for sparse representations of Gaussian Process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets by using an appealing parametrisation and projection techniques that use the RKHS norm.

### Fast Forward Selection to Speed Up Sparse Gaussian Process Regression

- Computer Science
- AISTATS
- 2003

A method for the sparse greedy approximation of Bayesian Gaussian process regression, featuring a novel heuristic for very fast forward selection, which leads to a sufficiently stable approximation of the log marginal likelihood of the training data, which can be optimised to adjust a large number of hyperparameters automatically.

### Transductive and Inductive Methods for Approximate Gaussian Process Regression

- Computer Science
- NIPS
- 2002

It is found that subset of representers methods can give good and particularly fast predictions for data sets with high and medium noise levels; on complex low-noise data sets, the Bayesian committee machine achieves significantly better accuracy, yet at a higher computational cost.

### Bayesian Gaussian process models: PAC-Bayesian generalisation error bounds and sparse approximations

- Computer Science
- 2003

The tractability and usefulness of simple greedy forward selection with information-theoretic criteria previously used in active learning is demonstrated and generic schemes for automatic model selection with many (hyper)parameters are developed.

### Gaussian Processes for Machine Learning

- Computer Science
- Adaptive computation and machine learning
- 2009

The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.

### Sparse On-Line Gaussian Processes

- Computer Science
- Neural Computation
- 2002

An approach for sparse representations of Gaussian process (GP) models (which are Bayesian types of kernel machines) is developed in order to overcome their limitations for large data sets; it is based on a combination of a Bayesian on-line algorithm and a sequential construction of a relevant subsample of data that fully specifies the prediction of the GP model.

### Sparse Greedy Gaussian Process Regression

- Computer Science
- NIPS
- 2000

A simple sparse greedy technique to approximate the maximum a posteriori estimate of Gaussian Processes with much improved scaling behaviour in the sample size m, with applications to large scale problems.

### Fast Sparse Gaussian Process Methods: The Informative Vector Machine

- Computer Science
- NIPS
- 2002

A framework for sparse Gaussian process (GP) methods is presented which uses forward selection with criteria based on information-theoretic principles, allows for Bayesian model selection, and is less complex in implementation.