
- Christopher M. Bishop, Nasser M. Nasrabadi
- J. Electronic Imaging
- 2007

This beautifully produced book is intended for advanced undergraduates, PhD students, and researchers and practitioners, primarily in machine learning or allied areas. The theoretical framework is, as far as possible, that of Bayesian decision theory, taking advantage of the computational tools now available for practical implementation of such methods.…

Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation of parameters in a latent variable model closely related to factor…

- M E Tipping, C M Bishop
- Neural Computation
- 1999
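The maximum-likelihood solution for probabilistic PCA described above has a closed form: the columns of the weight matrix are the top-q eigenvectors of the sample covariance, scaled by the retained eigenvalues, and the noise variance is the mean of the discarded eigenvalues. A minimal NumPy sketch of that closed-form fit (function name and data are illustrative, not from the paper):

```python
import numpy as np

def ppca_ml(X, q):
    """Fit probabilistic PCA by maximum likelihood.

    X: (N, d) data matrix; q: latent dimension (q < d).
    Returns the ML weight matrix W, noise variance sigma2, and mean mu.
    """
    mu = X.mean(axis=0)
    S = np.cov(X - mu, rowvar=False)            # sample covariance, (d, d)
    evals, evecs = np.linalg.eigh(S)            # ascending eigenvalues
    evals, evecs = evals[::-1], evecs[:, ::-1]  # sort descending
    sigma2 = evals[q:].mean()                   # noise = mean discarded variance
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    return W, sigma2, mu

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))  # toy correlated data
W, sigma2, mu = ppca_ml(X, q=2)
```

Unlike conventional PCA, this yields a full probability model of the data, so likelihoods, missing values, and mixtures of such models become tractable.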

Principal component analysis (PCA) is one of the most popular techniques for processing, compressing, and visualizing data, although its effectiveness is limited by its global linearity. While nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data complexity by a combination of local linear PCA projections. However,…

- Michael E. Tipping, Christopher M. Bishop
- Neural Computation
- 1999


- Christopher M. Bishop, Markus Svensén, Christopher K. I. Williams
- Neural Computation
- 1998

Latent variable models represent the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. A familiar example is factor analysis, which is based on a linear transformation between the latent space and the data space. In this article, we introduce a form of nonlinear latent variable model…

One of the central issues in the use of principal component analysis (PCA) for data modelling is that of choosing the appropriate number of retained components. This problem was recently addressed through the formulation of a Bayesian treatment of PCA (Bishop, 1999a) in terms of a probabilistic latent variable model. A central feature of this approach is…
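For contrast with the Bayesian treatment, the conventional approach to choosing the number of retained components inspects the eigenvalue spectrum directly, e.g. keeping the smallest q that explains a fixed fraction of the total variance. A NumPy sketch of that baseline (the 0.95 threshold is an arbitrary illustration, not a value from the paper):

```python
import numpy as np

def choose_q(X, threshold=0.95):
    """Smallest q whose components explain >= threshold of total variance."""
    evals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]  # descending
    ratio = np.cumsum(evals) / evals.sum()     # cumulative explained variance
    return int(np.searchsorted(ratio, threshold) + 1)

rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 2))             # 2 true underlying directions
X = latent @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(300, 6))
q = choose_q(X)
```

The threshold is arbitrary, which is precisely the kind of ad hoc choice the Bayesian formulation replaces with automatic determination of the effective dimensionality.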

- John M. Winn, Christopher M. Bishop
- Journal of Machine Learning Research
- 2005

Bayesian inference is now widely established as one of the principal foundations for machine learning. In practice, exact inference is rarely possible, and so a variety of approximation techniques have been developed, one of the most widely used being a deterministic framework called variational inference. In this paper we introduce Variational Message…

The Support Vector Machine (SVM) of Vapnik [9] has become widely established as one of the leading approaches to pattern recognition and machine learning. It expresses predictions in terms of a linear combination of kernel functions centred on a subset of the training data, known as support vectors. Despite its widespread success, the SVM suffers from some…
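The prediction form described above, a linear combination of kernel functions centred on support vectors, can be sketched in a few lines of NumPy. The RBF kernel and the weight values here are illustrative placeholders; in a trained SVM the weights and bias come from the optimisation:

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two vectors."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def predict(x, support_vectors, alpha, b=0.0):
    """SVM-style prediction: weighted sum of kernels on support vectors."""
    return sum(a * rbf(x, sv) for a, sv in zip(alpha, support_vectors)) + b

svs = np.array([[0.0, 0.0], [1.0, 1.0]])   # illustrative support vectors
alpha = np.array([1.0, -1.0])              # illustrative learned weights
y = predict(np.array([0.0, 0.0]), svs, alpha)
```

The Relevance Vector Machine proposed in this line of work keeps this same functional form but obtains the weights from a sparse Bayesian treatment, typically using far fewer basis functions and yielding predictive probabilities rather than hard decisions.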

Mixture models, in which a probability distribution is represented as a linear superposition of component distributions, are widely used in statistical modelling and pattern recognition. One of the key tasks in the application of mixture models is the determination of a suitable number of components. Conventional approaches based on cross-validation are…
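The "linear superposition of component distributions" is easy to make concrete: a mixture density is a convex combination of component densities. A minimal NumPy sketch with a two-component one-dimensional Gaussian mixture (all weights and parameters are illustrative):

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Univariate Gaussian density."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_pdf(x, weights, mus, sigmas):
    """Mixture density: weighted sum of component densities.

    The weights must be non-negative and sum to one for a valid density.
    """
    return sum(w * gaussian(x, m, s) for w, m, s in zip(weights, mus, sigmas))

p = mixture_pdf(0.0, weights=[0.3, 0.7], mus=[0.0, 2.0], sigmas=[1.0, 0.5])
```

Choosing the number of components then amounts to choosing how many terms appear in this sum, which is the model-selection problem the abstract addresses.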

- Christopher M. Bishop, Markus Svensén
- UAI
- 2003

The Hierarchical Mixture of Experts (HME) is a well-known tree-structured model for regression and classification, based on soft probabilistic splits of the input space. In its original formulation its parameters are determined by maximum likelihood, which is prone to severe over-fitting, including singularities in the likelihood function. Furthermore the…
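A soft probabilistic split at an HME gating node can be sketched in a few lines: a sigmoid gate produces the probability of routing an input to one branch, and the prediction is the gate-weighted mixture of the expert outputs. A minimal NumPy sketch with two linear experts (the gate and expert parameters are illustrative; in practice they are learned, by maximum likelihood in the original formulation and variationally here):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def hme_predict(x, v, expert_a, expert_b):
    """One gating node over two linear experts.

    v parameterises the soft split; the output mixes the experts by the
    gate probability rather than committing to a hard partition.
    """
    g = sigmoid(v @ x)                          # P(route to expert A | x)
    return g * (expert_a @ x) + (1.0 - g) * (expert_b @ x)

x = np.array([1.0, 0.5])
y = hme_predict(x, v=np.array([2.0, -1.0]),
                expert_a=np.array([1.0, 0.0]),
                expert_b=np.array([0.0, 1.0]))
```

A full HME nests such nodes in a tree, with each level softening the partition of the input space further.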