• Corpus ID: 17404261

Fast Forward Selection to Speed Up Sparse Gaussian Process Regression

Matthias W. Seeger, Christopher K. I. Williams, Neil D. Lawrence
We present a method for the sparse greedy approximation of Bayesian Gaussian process regression, featuring a novel heuristic for very fast forward selection. Our method is essentially as fast as an equivalent one which selects the "support" patterns at random, yet it can outperform random selection on hard curve fitting tasks. More importantly, it leads to a sufficiently stable approximation of the log marginal likelihood of the training data, which can be optimised to adjust a large number of… 

Efficient Nonparametric Bayesian Modelling with Sparse Gaussian Process Approximations
A general framework based on the informative vector machine (IVM) is presented and it is shown how the complete Bayesian task of inference and learning of free hyperparameters can be performed in a practically efficient manner.
Efficient sparsification for Gaussian process regression
Sparse variational inference for generalized Gaussian process models
A variational sparse solution for GPs under general likelihoods is developed by providing a new characterization of the gradients required for inference in terms of individual observation likelihood terms and demonstrating experimentally that the fixed point operator acts as a contraction in many cases and therefore leads to fast convergence.
A Support Set Selection Algorithm for Sparse Gaussian Process Regression
  • Xinlu Guo, K. Uehara
  • Computer Science
    2015 IIAI 4th International Congress on Advanced Applied Informatics
  • 2015
A new selection criterion based on the residual sum of squares is described; it scores the importance of training data and then updates the support set iteratively according to this score. However, the iterative updating procedure has high time complexity because the matrix must be recomputed.
Fast Allocation of Gaussian Process Experts
A scalable nonparametric Bayesian regression model based on a mixture of Gaussian process experts and the inducing points formalism underpinning sparse GP approximations that significantly outperforms six competitive baselines while requiring only a few hours of training.
Efficient Optimization for Sparse Gaussian Process Regression
An efficient optimization algorithm is presented that selects a subset of training data as the inducing set for sparse Gaussian process regression using a single objective; it can be used to optimize either the marginal likelihood or a variational free energy.
Sparse Gaussian Process regression model based on ℓ1/2 regularization
A new sparse GP model, referred to as GPHalf, is developed. It represents the GP as a generalized linear regression model and then uses the modified ℓ1/2 half thresholding algorithm to optimize the corresponding objective function, thus yielding a sparse GP model, along with a proof that the proposed model converges to a sparse solution.
Sparse Gaussian Processes using Pseudo-inputs
It is shown that this new Gaussian process (GP) regression model can match full GP performance with small M, i.e. very sparse solutions, and it significantly outperforms other approaches in this regime.
Noise Estimation in Gaussian Process Regression
We develop a computational procedure to estimate the covariance hyperparameters for semiparametric Gaussian process regression models with additive noise. Namely, the presented method can be used to
Sparse Spectrum Gaussian Process Regression
The achievable trade-offs between predictive accuracy and computational requirements are compared, and it is shown that these are typically superior to existing state-of-the-art sparse approximations.


Fast Sparse Gaussian Process Methods: The Informative Vector Machine
A framework for sparse Gaussian process (GP) methods is presented that uses forward selection with criteria based on information-theoretic principles, allows for Bayesian model selection, and is less complex in implementation.
Sparse On-Line Gaussian Processes
An approach for sparse representations of Gaussian process (GP) models (which are Bayesian types of kernel machines) is developed in order to overcome their limitations for large data sets. It is based on a combination of a Bayesian on-line algorithm and a sequential construction of a relevant subsample of data that fully specifies the prediction of the GP model.
Sparse Greedy Gaussian Process Regression
A simple sparse greedy technique is presented for approximating the maximum a posteriori estimate of Gaussian processes, with much improved scaling behaviour in the sample size m, and applications to large-scale problems are shown.
Gaussian processes: iterative sparse approximations
This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior of Gaussian processes, and combines the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations for each input, thus approximating a batch solution.
Evaluation of Gaussian processes and other methods for non-linear regression
It is shown that a Bayesian approach to learning in multi-layer perceptron neural networks achieves better performance than the commonly used early stopping procedure, even for reasonably short amounts of computation time.
Hybrid Adaptive Splines
An adaptive spline method for smoothing is proposed that combines features from both regression spline and smoothing spline approaches. One of its advantages is the ability to vary the
A Bayesian Committee Machine
It is found that the performance of the BCM improves if several test points are queried at the same time and is optimal if the number of test points is at least as large as the degrees of freedom of the estimator.
TAP Gibbs Free Energy, Belief Propagation and Sparsity
The adaptive TAP Gibbs free energy for a general densely connected probabilistic model with quadratic interactions and arbitrary single site constraints is derived. We show how a specific sequential
Query by committee
It is suggested that asymptotically finite information gain may be an important characteristic of good query algorithms, in which a committee of students is trained on the same data set.
Support Vector Machine Active Learning with Applications to Text Classification
Experimental results showing that employing the active learning method can significantly reduce the need for labeled training instances in both the standard inductive and transductive settings are presented.