Learning Curves for Gaussian Process Regression: Approximations and Bounds

@article{Sollich2002LearningCF,
  title={Learning Curves for Gaussian Process Regression: Approximations and Bounds},
  author={Peter Sollich and Anason S. Halees},
  journal={Neural Computation},
  year={2002},
  volume={14},
  pages={1393--1428}
}
We consider the problem of calculating learning curves (i.e., average generalization performance) of Gaussian processes used for regression. On the basis of a simple expression for the generalization error, in terms of the eigenvalue decomposition of the covariance function, we derive a number of approximation schemes. We identify where these become exact and compare with existing bounds on learning curves; the new approximations, which can be used for any input space dimension, generally get…
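The "simple expression" referenced here leads, in Sollich's earlier work, to self-consistent approximations of the learning curve in terms of the kernel eigenvalues lambda_i and noise level sigma^2. Below is a minimal Python sketch of one such fixed point, eps(n) = sum_i lambda_i / (1 + n*lambda_i / (sigma^2 + eps(n))), solved by iteration; the spectrum and noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

def learning_curve(lams, sigma2, ns, tol=1e-10, max_iter=10_000):
    """Approximate generalization error eps(n) from the self-consistent
    fixed point eps = sum_i lam_i / (1 + n*lam_i / (sigma2 + eps))."""
    lams = np.asarray(lams, dtype=float)
    eps = lams.sum()  # eps at n = 0 is the prior variance, a natural start
    curve = []
    for n in ns:
        for _ in range(max_iter):
            new = np.sum(lams / (1.0 + n * lams / (sigma2 + eps)))
            if abs(new - eps) < tol:
                break
            eps = new
        curve.append(eps)
    return np.array(curve)

# Hypothetical geometrically decaying spectrum, e.g. from a smooth kernel.
lams = 0.5 ** np.arange(1, 40)
print(learning_curve(lams, sigma2=0.05, ns=[1, 2, 5, 10, 20, 50, 100]))
```

The iteration reuses the previous eps as a warm start for the next n, which keeps convergence fast since the fixed point moves smoothly with n.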
Asymptotic analysis of the learning curve for Gaussian process regression
TLDR
The main result is the proof of a theorem giving the generalization error for a large class of correlation kernels and for any dimension when the number of observations is large.
Regularity dependence of the rate of convergence of the learning curve for Gaussian process regression
TLDR
The presented proof generalizes previous ones that were limited to more specific kernels or to small dimensions (one or two), and can be used to build an optimal strategy for resource allocation.
Learning curves for multi-task Gaussian process regression
TLDR
It is demonstrated that when learning many tasks, the learning curves separate into an initial phase, where the Bayes error on each task is reduced down to a plateau value by "collective learning" even though most tasks have not seen examples, and a final decay that occurs once the number of examples is proportional to the number of tasks.
Posterior Variance Analysis of Gaussian Processes with Application to Average Learning Curves
TLDR
A novel bound on the posterior variance function is derived that requires only local information, since it depends only on the number of training samples in the proximity of a considered test point; extending this bound to an average learning bound is shown to outperform existing approaches.
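For context on the quantity being bounded: the GP posterior variance at a test point shrinks as training samples accumulate nearby, which is the "local information" intuition above. A minimal sketch of the standard posterior-variance formula k(x*, x*) - k*^T (K + sigma^2 I)^{-1} k*, assuming an RBF kernel with unit signal variance; the length scale and noise level below are illustrative:

```python
import numpy as np

def rbf(a, b, ell=0.3):
    """Squared-exponential kernel with unit variance; `ell` is assumed."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

def posterior_variance(x_train, x_test, sigma2=0.01, ell=0.3):
    """Standard GP posterior variance at each test point."""
    K = rbf(x_train, x_train, ell) + sigma2 * np.eye(len(x_train))
    ks = rbf(x_train, x_test, ell)            # shape (n_train, n_test)
    return 1.0 - np.sum(ks * np.linalg.solve(K, ks), axis=0)

# Variance is small near the cluster of training inputs and large far away.
x_train = np.array([0.1, 0.15, 0.2, 0.8])
print(posterior_variance(x_train, np.linspace(0.0, 1.0, 5)))
```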
Continuous-Space Gaussian Process Regression and Generalized Wiener Filtering with Application to Learning Curves
TLDR
The general continuous-space Gaussian process regression equations are presented, their close connection with Wiener filtering is discussed, and the results are applied to the estimation of learning curves as functions of training-set size and input dimensionality.
Generalization Errors and Learning Curves for Regression with Multi-task Gaussian Processes
TLDR
The asymmetric two-task case, where a secondary task is to help the learning of a primary task, is analyzed, and bounds on the generalization error and the learning curve of the primary task are given.
Exact learning curves for Gaussian process regression on large random graphs
TLDR
It is shown that for discrete input domains, where similarity between input points is characterised in terms of a graph, accurate predictions can be obtained and should in fact become exact for large graphs drawn from a broad range of random graph ensembles with arbitrary degree distributions.
Learning curves for Gaussian process regression with power-law priors and targets
TLDR
It is shown that the generalization error of kernel ridge regression (KRR) has the same asymptotics as that of Gaussian process regression (GPR) when the eigenspectrum of the prior and the eigenexpansion coefficients of the target function decay with a power-law rate β.
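As a hedged illustration of this power-law regime, one can feed a power-law spectrum lambda_i = i^(-b) into the self-consistent sketch near the top of this page (learning_curve) and read off an approximate decay exponent from the slope of log eps against log n; all parameter values here are illustrative, not taken from the paper.

```python
import numpy as np

# Reuses learning_curve() from the sketch near the top of this page.
b = 2.0                                           # assumed spectral decay
lams = np.arange(1, 20_001, dtype=float) ** (-b)
ns = np.array([100, 200, 400, 800, 1600])
eps = learning_curve(lams, sigma2=0.1, ns=ns)

# Slope of log(eps) vs log(n) estimates the learning-curve exponent.
slope = np.polyfit(np.log(ns), np.log(eps), 1)[0]
print(f"empirical decay exponent ~ {slope:.2f}")
```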
Learning Curves for Gaussian Processes via Numerical Cubature Integration
TLDR
An approach is presented in which the recursion equations for the generalization error are approximately solved using numerical cubature integration, so that the eigenfunction expansion of the covariance function does not need to be known.
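The paper's specific recursion is not reproduced here, but the cubature idea itself is easy to illustrate: Gauss-Hermite rules approximate Gaussian expectations without any eigenfunction expansion. A generic sketch using NumPy's hermgauss nodes (the test function and distribution are arbitrary examples):

```python
import numpy as np

def gauss_hermite_expectation(f, mean, var, order=20):
    """Approximate E[f(x)] for x ~ N(mean, var) by Gauss-Hermite cubature."""
    # hermgauss gives nodes/weights for integrals of exp(-t^2) * g(t).
    t, w = np.polynomial.hermite.hermgauss(order)
    x = mean + np.sqrt(2.0 * var) * t          # change of variables
    return np.sum(w * f(x)) / np.sqrt(np.pi)

# Sanity check: E[x^2] = 1 for x ~ N(0, 1).
print(gauss_hermite_expectation(lambda x: x ** 2, 0.0, 1.0))
```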
Replica theory for learning curves for Gaussian processes on random graphs
TLDR
It is shown that replica techniques can be used to obtain exact performance predictions in the limit of large graphs, after first rewriting the average error in terms of a graphical model.

References

Showing 1–10 of 36 references
Approximate learning curves for Gaussian processes
TLDR
The problem of calculating learning curves of Gaussian processes used for regression is considered, and a simple expression for the generalization error in terms of the eigenvalue decomposition of the covariance function is derived.
Learning Curves for Gaussian Processes
I consider the problem of calculating learning curves (i.e., average generalization performance) of Gaussian processes used for regression. A simple expression for the generalization error in terms of the eigenvalue decomposition of the covariance function is derived…
Upper and Lower Bounds on the Learning Curve for Gaussian Processes
TLDR
An explanation of the early, linearly decreasing behavior of the learning curves and the bounds is presented, along with a study of the asymptotic behavior of the curves.
Learning Curves for Gaussian Processes Regression: A Framework for Good Approximations
TLDR
A method for approximately computing average-case learning curves for Gaussian process regression models is presented; it is explained how the approximation can be systematically improved, and it is argued that similar techniques can be applied to general likelihood models.
Gaussian regression and optimal finite dimensional linear models
The problem of regression under Gaussian assumptions is treated generally. The relationship between Bayesian prediction, regularization and smoothing is elucidated. The ideal regression is the…
General Bounds on Bayes Errors for Regression with Gaussian Processes
TLDR
Based on a simple convexity lemma, bounds for different types of Bayesian prediction errors for regression with Gaussian processes are developed, yielding asymptotically tight results.
Finite-Dimensional Approximation of Gaussian Processes
TLDR
This work derives optimal finite-dimensional predictors under a number of assumptions, and shows the superiority of these predictors over the Projected Bayes Regression method (which is asymptotically optimal).
Regression with Input-dependent Noise: A Gaussian Process Treatment
TLDR
This paper shows that prior uncertainty about the parameters controlling both processes can be handled, that the posterior distribution of the noise rate can be sampled using Markov chain Monte Carlo methods, and that this yields a posterior noise variance that closely approximates the true variance.
On average case complexity of linear problems with noisy information
L. Plaskota, Journal of Complexity, 1990
Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond
The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor…