
- Zoubin Ghahramani, Thomas L. Griffiths, Peter Sollich
- 2007

We describe a flexible nonparametric approach to latent variable modelling in which the number of latent variables is unbounded. This approach is based on a probability distribution over equivalence classes of binary matrices with a finite number of rows, corresponding to the data points, and an unbounded number of columns, corresponding to the latent…
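The unbounded binary-matrix distribution described in this abstract is the Indian Buffet Process (IBP). As a rough illustration — the sequential "restaurant" generative scheme, not the paper's equivalence-class derivation, with all sizes and the seed chosen arbitrarily — a minimal sampler might look like:

```python
import numpy as np

def sample_ibp(n_customers, alpha, rng=None):
    """Draw a binary matrix from the Indian Buffet Process.

    Rows are data points ("customers"), columns are latent features
    ("dishes"); the number of columns is unbounded a priori.
    """
    rng = np.random.default_rng(rng)
    dishes = []  # dishes[k] = number of customers who have taken dish k
    rows = []
    for i in range(1, n_customers + 1):
        row = []
        # customer i takes existing dish k with probability m_k / i
        for k, m_k in enumerate(dishes):
            take = int(rng.random() < m_k / i)
            row.append(take)
            dishes[k] += take
        # ... then tries Poisson(alpha / i) brand-new dishes
        new = rng.poisson(alpha / i)
        row.extend([1] * new)
        dishes.extend([1] * new)
        rows.append(row)
    # pad rows out to a rectangular matrix
    K = len(dishes)
    Z = np.zeros((n_customers, K), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = sample_ibp(10, alpha=2.0, rng=0)
```

The number of columns of `Z` is random: it grows with the number of customers, which is exactly the "unbounded number of latent variables" the abstract refers to.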

I describe a framework for interpreting Support Vector Machines (SVMs) as maximum a posteriori (MAP) solutions to inference problems with Gaussian Process priors. This probabilistic interpretation can provide intuitive guidelines for choosing a 'good' SVM kernel. Beyond this, it allows Bayesian methods to be used for tackling two of the outstanding…

We study the characteristics of learning with ensembles. Solving exactly the simple model of an ensemble of linear students, we find surprisingly rich behaviour. For learning in large ensembles, it is advantageous to use under-regularized students, which actually over-fit the training data. Globally optimal performance can be obtained by choosing the…
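A quick numerical companion to this abstract — my own toy setup with arbitrary sizes, not the paper's exact statistical-mechanics model — averages under-regularized ridge "students" trained on random halves of the data. The ambiguity decomposition guarantees the averaged ensemble's squared error is below the average individual student's error whenever the students disagree:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, K = 50, 60, 20           # input dimension, training examples, ensemble size
sigma = 0.5                    # output noise level

w_teacher = rng.standard_normal(N)
X = rng.standard_normal((P, N))
y = X @ w_teacher + sigma * rng.standard_normal(P)

def ridge(Xs, ys, lam):
    # standard ridge-regression "student"
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(N), Xs.T @ ys)

lam = 1e-3   # deliberately under-regularized: each student over-fits its data
students = []
for _ in range(K):
    idx = rng.choice(P, size=P // 2, replace=False)  # each sees half the data
    students.append(ridge(X[idx], y[idx], lam))

X_test = rng.standard_normal((2000, N))
y_test = X_test @ w_teacher    # clean targets for measuring generalization

def gen_error(w):
    return np.mean((X_test @ w - y_test) ** 2)

single = np.mean([gen_error(w) for w in students])   # average single-student error
w_bar = np.mean(students, axis=0)
ensemble = gen_error(w_bar)                          # error of the averaged ensemble
```

Here `ensemble < single` by construction; the paper's point is the sharper one that in large ensembles the *optimal* amount of regularization per student is below what a single student would use.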

I consider the problem of calculating learning curves (i.e., average generalization performance) of Gaussian processes used for regression. A simple expression for the generalization error in terms of the eigenvalue decomposition of the covariance function is derived, and used as the starting point for several approximation schemes. I identify where these…

We consider the problem of calculating learning curves (i.e., average generalization performance) of Gaussian processes used for regression. On the basis of a simple expression for the generalization error, in terms of the eigenvalue decomposition of the covariance function, we derive a number of approximation schemes. We identify where these become exact…

We analyze online gradient descent learning from finite training sets at noninfinitesimal learning rates η. Exact results are obtained for the time-dependent generalization error of a simple model system: a linear network with a large number of weights N, trained on p = αN examples. This allows us to study in detail the effects of finite training set…
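The model system in this abstract is easy to simulate. The sketch below is a hedged, noiseless version with arbitrary N, α, and η (the paper's analysis is exact in the large-N limit and covers noise too): a linear student trained by online gradient descent, cycling at random through a fixed set of p = αN examples, with the generalization error recorded after each pass:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
alpha, eta = 2.0, 0.1          # examples per weight, learning rate
p = int(alpha * N)

w_star = rng.standard_normal(N)
X = rng.standard_normal((p, N)) / np.sqrt(N)   # inputs scaled so |x| ~ 1
y = X @ w_star                                 # noiseless teacher outputs

def gen_error(w):
    # for inputs with covariance I/N: E_x[(w.x - w_star.x)^2] = |w - w_star|^2 / N
    return np.sum((w - w_star) ** 2) / N

w = np.zeros(N)
errs = [gen_error(w)]
for t in range(50 * p):                        # 50 passes, one random example per step
    i = rng.integers(p)
    w += eta * (y[i] - X[i] @ w) * X[i]        # online gradient step on example i
    if (t + 1) % p == 0:
        errs.append(gen_error(w))
```

Because the same p examples are revisited, the dynamics differ from true online learning with fresh examples — that distinction, at finite η, is the "finite training set" effect the abstract refers to.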

We address the problem of model selection for Support Vector Machine (SVM) classification. For fixed functional form of the kernel, model selection amounts to tuning kernel parameters and the slack penalty coefficient C. We begin by reviewing a recently developed probabilistic framework for SVM classification. An extension to the case of SVMs with quadratic…
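For orientation only: the tuning problem described here — choosing kernel parameters and C for a fixed kernel form — is most commonly attacked today by cross-validated grid search, which is a standard baseline and *not* the probabilistic framework this paper develops. A minimal sketch with scikit-learn (dataset and grid values are arbitrary choices of mine):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# toy binary classification problem, standing in for real data
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# grid over the slack penalty C and the RBF kernel width gamma
param_grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

The appeal of the evidence-based approach in this paper and the next is that it replaces such a discrete grid with gradient-based tuning of continuous hyperparameters.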

A Bayesian point of view of SVM classifiers allows the definition of a quantity analogous to the evidence in probabilistic models. By maximizing this quantity, one can systematically tune hyperparameters and, via automatic relevance determination (ARD), select relevant input features. Evidence gradients are expressed as averages over the associated posterior and can…

The equivalent kernel [1] is a way of understanding how Gaussian process regression works for large sample sizes based on a continuum limit. In this paper we show (1) how to approximate the equivalent kernel of the widely-used squared exponential (or Gaussian) kernel and related kernels, and (2) how analysis using the equivalent kernel helps to understand…
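The finite-n object whose continuum limit is the equivalent kernel is easy to compute directly: the GP posterior mean at a point x* is a weighted sum h·y of the training targets, with h = (K + σ²I)⁻¹ k*. The sketch below (my own parameter choices, dense evenly-spaced 1D inputs) evaluates this weight function for the squared exponential kernel; for large n it approaches the smoothing filter the paper approximates:

```python
import numpy as np

def se_kernel(a, b, ell=0.05):
    # squared exponential (Gaussian) covariance function
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell**2))

n, sigma2 = 400, 0.1
X = np.linspace(0, 1, n)                  # dense, evenly spaced training inputs
K = se_kernel(X, X)
x_star = np.array([0.5])
k_star = se_kernel(x_star, X).ravel()

# weight function: posterior mean at x_star is h @ y for any targets y
h = np.linalg.solve(K + sigma2 * np.eye(n), k_star)
```

The weights `h` are concentrated around x* and sum to roughly one (so a constant function is reproduced almost exactly), which is the basic smoothing-filter picture that the equivalent-kernel analysis makes quantitative.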