The SGD-QN algorithm is a stochastic gradient descent algorithm that makes careful use of second-order information and splits the parameter update into independently scheduled components. Thanks to this design, SGD-QN iterates nearly as fast as a first-order stochastic gradient descent but requires fewer iterations to achieve the same accuracy.
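As a rough illustration of that design, here is a minimal Python sketch (not the authors' reference implementation): a plain SGD step is rescaled componentwise by a diagonal matrix B that approximates inverse curvature, and B is refreshed on its own, slower schedule, so most iterations cost no more than first-order SGD. The function name and the skip/lam parameters are assumptions made for the example.

import numpy as np

def sgdqn_sketch(grad, w0, n_steps, t0=10.0, lam=1e-4, skip=16):
    # grad(w): stochastic gradient of the regularized loss at w.
    # B: diagonal rescaling (rough inverse-curvature estimate), refreshed
    # only every `skip` steps -- the independently scheduled component.
    w = np.asarray(w0, dtype=float).copy()
    B = np.full_like(w, 1.0 / lam)
    for t in range(n_steps):
        g = grad(w)
        w_new = w - B * g / (t + t0)      # rescaled first-order step
        if t % skip == 0:                 # slower schedule for B
            dg = grad(w_new) - g          # secant estimate (ideally taken
            dw = w_new - w                # on the same training example)
            ratio = np.abs(dg) / (np.abs(dw) + 1e-12)
            B = 1.0 / (ratio + lam)       # bounded diagonal estimate
        w = w_new
    return w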
Optimization algorithms for large margin multiclass recognizers are often too costly to handle ambitious problems with structured outputs and exponential numbers of classes. Optimization algorithms that rely on the full gradient are not effective because, unlike the solution, the gradient is not sparse and is very large. The LaRank algorithm sidesteps this difficulty …
Features gathered from the observation of a phenomenon are not all equally informative: some of them may be noisy, correlated or irrelevant. Feature selection aims at selecting a feature set that is relevant for a given task. This problem is complex and remains an important issue in many domains. In the field of neural networks, feature selection has been …
In ranking with the pairwise classification approach, the loss associated with a predicted ranked list is the mean of the pairwise classification losses. This loss is inadequate for tasks like information retrieval, where we prefer ranked lists with high precision at the top of the list. We propose to optimize a larger class of loss functions for ranking, …
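To make the pairwise construction concrete, here is a minimal Python sketch (illustrative only, not the paper's exact objective) of the mean pairwise loss, followed by a pair-weighted variant of the kind that biases the objective toward the top of the list; the logistic pair loss and the weighting scheme are assumptions made for the example.

import numpy as np

def mean_pairwise_loss(scores, labels):
    # Mean over all (relevant, irrelevant) pairs of a logistic loss on the
    # score difference: the standard pairwise classification loss.
    pos, neg = scores[labels == 1], scores[labels == 0]
    diffs = pos[:, None] - neg[None, :]        # s_i - s_j for every pair
    return np.log1p(np.exp(-diffs)).mean()

def weighted_pairwise_loss(scores, labels, weight=lambda r: 1.0 / (1 + r)):
    # Same pairs, but each relevant item's pairs are weighted by a decreasing
    # function of its rank, so mistakes near the top cost more -- one
    # illustrative instance of a larger class of pairwise ranking losses.
    order = np.argsort(-scores)                # ranks induced by the scores
    rank = np.empty_like(order)
    rank[order] = np.arange(len(scores))
    neg = scores[labels == 0]
    total, wsum = 0.0, 0.0
    for i in np.flatnonzero(labels == 1):
        w = weight(rank[i])
        total += w * np.log1p(np.exp(-(scores[i] - neg))).sum()
        wsum += w * len(neg)
    return total / wsum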
We address the problem of designing surrogate losses for learning scoring functions in the context of label ranking. We extend to ranking problems a notion of order-preserving losses previously introduced for multiclass classification, and show that these losses lead to consistent formulations with respect to a family of ranking evaluation metrics.
1. Introduction: Neural Networks (NN) are used in quite a variety of real-world applications, where one can usually measure a potentially large number N of variables X_i; probably not all X_i are equally informative: some should even be considered as noise to be eliminated. If one could select the n << N "best" variables X_i, then one could reduce the amount of …
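As a toy illustration of the n << N idea (a simple filter criterion, not the neural-network-based selection these papers study), one could rank the N variables by absolute correlation with the target and keep the n best:

import numpy as np

def select_top_n(X, y, n):
    # Rank each of the N columns of X by |correlation| with y and keep the
    # n best -- purely illustrative of reducing N measured variables to n.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)
    best = np.argsort(corr)[::-1][:n]
    return best, X[:, best]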
1. Introduction: Neural Networks (NN) have recently been used in a large variety of real-world applications. In many problems, one can measure N variables X_i from a potentially large set (N large), but probably not all of these are equally informative: if one could select the n << N "best" variables X_i, then one could reduce the amount of data to gather and …
LSHTC is a series of challenges that aims to assess the performance of classification systems in large-scale classification with a large number of classes (up to hundreds of thousands). This paper describes the datasets that have been released along with the LSHTC series. The paper details the construction of the datasets and the design of the tracks, as well as …
We study surrogate losses for learning to rank, in a framework where the rankings are induced by scores and the task is to learn the scoring function. We focus on the calibration of surrogate losses with respect to a ranking evaluation metric, where the calibration is equivalent to the guarantee that near-optimal values of the surrogate risk imply …
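In symbols (notation assumed here for illustration, with R_phi the surrogate risk and R the ranking risk induced by the evaluation metric), calibration amounts to the guarantee:

% Illustrative restatement; (f_n) is any sequence of scoring functions.
R_\phi(f_n) \to \inf_f R_\phi(f)
\quad \Longrightarrow \quad
R(f_n) \to \inf_f R(f)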