
- John Shawe-Taylor, Nello Cristianini
- ICTAI
- 2003

A rich family of ‘pattern analysis’ algorithms, whose best-known element is the Support Vector Machine. A very general task: given a set of data (in any form, not necessarily vectors), find patterns (= any relations). (Examples of relations: classifications, regressions, principal directions, correlations, clusters, rankings, etc.) (Examples of data: gene… (More)

- Bernhard Schölkopf, John C. Platt, John Shawe-Taylor, Alexander J. Smola, Robert C. Williamson
- Neural Computation
- 2001

Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f… (More)

- David R. Hardoon, Sándor Szedmák, John Shawe-Taylor
- Neural Computation
- 2004

We present a general method using kernel canonical correlation analysis to learn a semantic representation to web images and their associated text. The semantic space provides a common representation and enables a comparison between the text and images. In the experiments, we look at two approaches of retrieving images based on only their content from a… (More)

- John C. Platt, Nello Cristianini, John Shawe-Taylor
- NIPS
- 1999

We present a new learning architecture: the Decision Directed Acyclic Graph (DDAG), which is used to combine many two-class classifiers into a multiclass classifier. For an N-class problem, the DDAG contains N(N-1)/2 classifiers, one for each pair of classes. We present a VC analysis of the case when the node classifiers are hyperplanes; the resulting bound on the… (More)
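A minimal sketch of how a DDAG might be evaluated, assuming the list-elimination scheme usually associated with this architecture: each pairwise node compares the first and last remaining classes and discards the loser, so N classes need N-1 decisions. The names `ddag_predict` and `make_nearest_center_classifiers`, and the toy 1-D nearest-center node classifiers, are illustrative assumptions and do not come from the abstract.

```python
from itertools import combinations


def make_nearest_center_classifiers(centers):
    """Build one toy two-class classifier per pair of classes (hypothetical helper).

    `centers` maps a class label to a 1-D center; each pairwise classifier
    returns whichever of its two classes has the nearer center.
    """
    def make(a, b):
        return lambda x: a if abs(x - centers[a]) <= abs(x - centers[b]) else b

    # One classifier for each unordered pair: N(N-1)/2 in total.
    return {(a, b): make(a, b) for a, b in combinations(list(centers), 2)}


def ddag_predict(x, classes, pairwise):
    """Evaluate the DDAG on x by eliminating one class per node."""
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = pairwise[(a, b)](x)
        if winner == a:
            remaining.pop()      # the last class is eliminated
        else:
            remaining.pop(0)     # the first class is eliminated
    return remaining[0]


centers = {"a": 0.0, "b": 5.0, "c": 10.0}
pairwise = make_nearest_center_classifiers(centers)
prediction = ddag_predict(4.2, list(centers), pairwise)
```

Because the remaining list keeps the original class order, every lookup uses a key `(a, b)` with `a` listed before `b`, which matches the keys produced by `combinations`.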

We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text though not necessarily contiguously. The subsequences are weighted by an exponentially… (More)
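The kernel described above can be illustrated with a brute-force sketch that builds the feature map explicitly. It assumes each occurrence of a subsequence is weighted by `decay ** span`, where `span` is the length of text the occurrence covers; the abstract is truncated before it states the exact weighting, so this detail, the parameter name `decay`, and both function names are assumptions. The enumeration is exponential in `k` and only suitable for short strings with `k >= 1`.

```python
from collections import defaultdict
from itertools import combinations


def subsequence_features(text, k, decay):
    """Map each length-k subsequence of text to its total decayed weight."""
    features = defaultdict(float)
    for idx in combinations(range(len(text)), k):
        span = idx[-1] - idx[0] + 1          # stretch of text covered by this occurrence
        features["".join(text[i] for i in idx)] += decay ** span
    return features


def subsequence_kernel(s, t, k, decay):
    """Inner product of the two explicit subsequence feature maps."""
    fs = subsequence_features(s, k, decay)
    ft = subsequence_features(t, k, decay)
    return sum(weight * ft[u] for u, weight in fs.items() if u in ft)


# "cat" and "car" share only the subsequence "ca", which spans 2 characters in each.
k_cat_car = subsequence_kernel("cat", "car", k=2, decay=0.5)
```

A practical implementation would use a dynamic-programming recursion instead of enumerating index tuples; the explicit version is shown only because it mirrors the definition directly.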

We introduce the notion of kernel-alignment, a measure of similarity between two kernel functions or between a kernel and a target function. This quantity captures the degree of agreement between a kernel and a given learning task, and has very natural interpretations in machine learning, leading also to simple algorithms for model selection and learning.… (More)
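As a rough illustration, alignment can be computed as a normalized Frobenius inner product between two kernel matrices on the same sample, with the target represented by the matrix of label products. The truncated abstract does not state the formula, so this definition and the helper names `frobenius_inner`, `alignment`, and `target_matrix` are assumptions in this sketch.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product of two equally sized matrices (lists of lists)."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(k1, k2):
    """Cosine-style similarity between two kernel matrices."""
    return frobenius_inner(k1, k2) / math.sqrt(
        frobenius_inner(k1, k1) * frobenius_inner(k2, k2)
    )


def target_matrix(labels):
    """Ideal kernel for +1/-1 labels: entry (i, j) is labels[i] * labels[j]."""
    return [[yi * yj for yj in labels] for yi in labels]


labels = [1, 1, -1, -1]
# A block kernel that groups the first two and the last two points together.
block_kernel = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
score = alignment(block_kernel, target_matrix(labels))
```

A kernel aligned perfectly with the target would score 1; the block kernel scores lower because it assigns zero similarity to cross-class pairs, where the target matrix holds -1.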

- Ayhan Demiriz, Kristin P. Bennett, John Shawe-Taylor
- Machine Learning
- 2002

We examine linear program (LP) approaches to boosting and demonstrate their efficient solution using LPBoost, a column generation based simplex method. We formulate the problem as if all possible weak hypotheses had already been generated. The labels produced by the weak hypotheses become the new feature space of the problem. The boosting task becomes to… (More)

Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a “simple” subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified ν between 0 and 1. We propose a method to approach this problem by trying to estimate a function f which is… (More)

- John Shawe-Taylor, Peter L. Bartlett, Robert C. Williamson, Martin Anthony
- IEEE Trans. Information Theory
- 1998

The paper introduces some generalizations of Vapnik’s method of structural risk minimisation (SRM). As well as making explicit some of the details on SRM, it provides a result that allows one to trade off errors on the training sample against improved generalization performance. It then considers the more general case when the hierarchy of classes is chosen… (More)

Kernel methods make it relatively easy to define complex high-dimensional feature spaces. This raises the question of how we can identify the relevant subspaces for a particular learning task. When two views of the same phenomenon are available, kernel Canonical Correlation Analysis (KCCA) has been shown to be an effective preprocessing step that can improve… (More)