Olivier Chapelle

While numerous metrics for information retrieval are available in the case of binary relevance, there is only one commonly used metric for graded relevance, namely the Discounted Cumulative Gain (DCG). A drawback of DCG is its additive nature and the underlying independence assumption: a document in a given position always has the same gain and discount…
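To make the additive structure of DCG concrete, here is a minimal sketch. It assumes one commonly used variant (gain 2^rel - 1, discount 1/log2(rank + 1)); the abstract above does not specify which variant it has in mind, and the function name `dcg` is chosen only for illustration.

```python
import math

def dcg(relevances, k=None):
    """Discounted Cumulative Gain of a ranked list of graded relevance labels.

    Assumed variant: gain = 2**rel - 1, discount = 1 / log2(rank + 1),
    with ranks starting at 1. Each position contributes independently
    of the other documents, which is the additivity the abstract mentions.
    """
    if k is not None:
        relevances = relevances[:k]
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

# The same document (grade 3) contributes the same amount at rank 1
# regardless of what is ranked below it.
print(dcg([3, 2, 0]))
print(dcg([3, 0, 0]))
```

Because the score is a plain sum over positions, the contribution of a document depends only on its own grade and rank, never on the documents around it.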
The problem of automatically tuning multiple parameters for pattern recognition Support Vector Machines (SVMs) is considered. This is done by minimizing some estimates of the generalization error of SVMs using a gradient descent algorithm over the set of parameters. Usual methods for choosing parameters, based on exhaustive search, become intractable as soon…
We believe that the cluster assumption is key to successful semi-supervised learning. Based on this, we propose three semi-supervised algorithms: 1. deriving graph-based distances that emphasize low density regions between clusters, followed by training a standard SVM; 2. optimizing the Transductive SVM objective function, which places the decision boundary…
We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leave-one-out error. This search can be efficiently performed via gradient descent. The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life…
Most literature on support vector machines (SVMs) concentrates on the dual optimization problem. In this letter, we point out that the primal problem can also be solved efficiently for both linear and nonlinear SVMs and that there is no reason for ignoring this possibility. On the contrary, from the primal point of view, new families of algorithms for…
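As a rough illustration of what optimizing the primal directly can look like for a linear SVM, the sketch below runs plain gradient descent on an L2-regularized squared hinge loss. This is an assumption-laden toy, not the algorithm of the letter: the loss choice, the absence of a bias term, the hyperparameters, and the names `train_primal_svm` and `predict` are all illustrative.

```python
def train_primal_svm(X, y, lam=0.1, lr=0.01, steps=200):
    """Minimize 0.5*lam*||w||^2 + sum_i max(0, 1 - y_i * w.x_i)^2 over w.

    Plain gradient descent on the primal objective (no bias term).
    The squared hinge is used because it is differentiable everywhere.
    """
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [lam * wi for wi in w]          # gradient of the regularizer
        for x, label in zip(X, y):
            slack = 1.0 - label * sum(wi * xi for wi, xi in zip(w, x))
            if slack > 0.0:                    # only margin violators contribute
                for d, xi in enumerate(x):
                    grad[d] -= 2.0 * slack * label * xi
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def predict(w, x):
    """Sign of the linear score w.x, mapped to {+1, -1}."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0.0 else -1

# Tiny linearly separable toy set (hypothetical data).
X = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w = train_primal_svm(X, y)
print(w, [predict(w, x) for x in X])
```

The weight vector is the optimization variable here, so no dual variables or kernel matrix are needed in the linear case.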
Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that support vector machines (SVMs) can generalize well on difficult image classification problems where the only features are high-dimensional histograms. Heavy-tailed RBF kernels of the form K(x,…
We present a system and a set of techniques for learning linear predictors with convex losses on terascale datasets, with trillions of features, billions of training examples and millions of parameters in an hour using a cluster of 1000 machines. Individually, none of the component techniques is new, but the careful synthesis required to obtain an efficient…
Learning to rank for information retrieval has gained a lot of interest in recent years, but there is a lack of large real-world datasets to benchmark algorithms. That led us to publicly release two datasets used internally at Yahoo! for learning the web search ranking function. To promote these datasets and foster the development of state-of-the-art…
RankSVM (Herbrich et al. in Advances in large margin classifiers. MIT Press, Cambridge, MA, 2000; Joachims in Proceedings of the ACM conference on knowledge discovery and data mining (KDD), 2002) is a pairwise method for designing ranking models. SVMLight is the only publicly available software for RankSVM. It is slow and, due to incomplete training with…
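The pairwise idea behind RankSVM can be sketched as an objective over preference pairs. The snippet below assumes a linear scoring function, a standard hinge loss, and a single query, so that every pair of documents with different labels is comparable; the function name `pairwise_hinge_objective` and the constant `C` are illustrative and not taken from the abstract.

```python
def pairwise_hinge_objective(w, X, y, C=1.0):
    """0.5*||w||^2 + C * sum over pairs (i, j) with y_i > y_j of
    max(0, 1 - (w.x_i - w.x_j)).

    Each pair asks the more relevant document to score at least
    one unit higher than the less relevant one.
    """
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in X]
    reg = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for i in range(len(X)):
        for j in range(len(X)):
            if y[i] > y[j]:                      # i should be ranked above j
                loss += max(0.0, 1.0 - (scores[i] - scores[j]))
    return reg + C * loss

# Two documents for one query; the first is more relevant (hypothetical data).
X = [[2.0], [0.0]]
y = [1, 0]
print(pairwise_hinge_objective([1.0], X, y))     # pair satisfied, only the regularizer remains
print(pairwise_hinge_objective([0.0], X, y))     # pair violated with zero margin
```

The double loop makes the quadratic number of pairs explicit, which is one reason naive pairwise training becomes slow on large datasets.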