Metric learning has attracted a lot of interest over the last decade, but little work has been done on the generalization ability of such methods. In this paper, we address this issue by proposing an adaptation of the notion of algorithmic robustness, previously introduced by Xu and Mannor in classic supervised learning, to derive generalization bounds.
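For context, here is the generic generalization guarantee that algorithmic robustness yields in the framework of Xu and Mannor; a hedged sketch of the standard result, not necessarily the adapted bound derived in this paper:

```latex
% An algorithm is (K, \epsilon(\cdot))-robust if the sample space can be
% partitioned into K disjoint sets such that whenever a training sample s
% and a point z fall in the same set,
%   |\ell(h_S, s) - \ell(h_S, z)| \le \epsilon(S).
% For a loss bounded by M, with probability at least 1 - \delta over the
% draw of the n training examples (Xu & Mannor):
\left| R(h_S) - \hat{R}_S(h_S) \right|
  \;\le\; \epsilon(S) \;+\; M \sqrt{\frac{2K \ln 2 + 2 \ln(1/\delta)}{n}}
```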
The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest.
In this paper, we investigate how to scale up kernel methods to take on large-scale problems, on which deep neural networks have been prevailing. To this end, we leverage existing techniques and develop new ones. These techniques include approximating kernel functions with features derived from random projections, parallel training of kernel models with 100…
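The first of these techniques is commonly instantiated as random Fourier features (Rahimi and Recht); a minimal sketch for the RBF kernel, where the dimensions and bandwidth below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def random_fourier_features(X, n_features=500, gamma=1.0, seed=0):
    """Map X (n_samples, d) to random features whose inner products
    approximate the RBF kernel exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies sampled from the Fourier transform of the RBF kernel.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# A linear model trained on Z then approximates a kernel machine on X.
X = np.random.default_rng(1).normal(size=(100, 10))
Z = random_fourier_features(X, n_features=2000, gamma=0.5)
K_approx = Z @ Z.T  # entrywise close to exp(-0.5 * ||x_i - x_j||^2)
```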
In recent years, the crucial importance of metrics in machine learning algorithms has led to an increasing interest in optimizing distance and similarity functions. Most of the state of the art focuses on learning Mahalanobis distances (which requires fulfilling a positive semi-definiteness constraint) for use in a local k-NN algorithm. However, no…
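In practice, the positive semi-definiteness constraint mentioned above is often enforced by parameterizing the metric matrix as M = LᵀL; a minimal sketch of this standard factorization trick, an illustration of the constraint rather than this paper's method:

```python
import numpy as np

def mahalanobis_sq(x, y, L):
    """Squared Mahalanobis distance (x - y)^T M (x - y) with M = L^T L.
    M is positive semi-definite by construction, so the result is
    guaranteed to be non-negative for any real matrix L."""
    diff = L @ (x - y)
    return float(diff @ diff)

rng = np.random.default_rng(0)
d = 5
L = rng.normal(size=(d, d))  # any real L yields a valid PSD metric
x, y = rng.normal(size=d), rng.normal(size=d)
assert mahalanobis_sq(x, y, L) >= 0.0
```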
Similarity functions are a fundamental component of many learning algorithms. When dealing with string or tree-structured data, measures based on the edit distance are widely used, and there exist a few methods for learning them from data. However, these methods offer no theoretical guarantee as to the generalization ability and discriminative power of the learned similarities.
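For reference, the classic (unlearned) edit distance these methods build on is the Levenshtein distance, computed by dynamic programming; learned variants typically replace the unit costs below with costs fitted from data:

```python
def edit_distance(s, t):
    """Levenshtein distance: the minimum number of insertions, deletions,
    and substitutions needed to turn string s into string t."""
    m, n = len(s), len(t)
    # dp[i][j] = edit distance between prefixes s[:i] and t[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all of s[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution or match
    return dp[m][n]

assert edit_distance("kitten", "sitting") == 3
```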
During the past few years, several approaches have been proposed to derive string kernels from probability distributions. For instance, the Fisher kernel uses a generative model M (e.g., a hidden Markov model) and compares two strings according to how they are generated by M. On the other hand, marginalized kernels allow the computation of the joint similarity…
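For context, the Fisher kernel mentioned above is standardly defined as follows; this is the general form due to Jaakkola and Haussler, a sketch rather than necessarily the exact variant discussed here:

```latex
% Fisher score of a string x under a generative model with parameters \theta:
U_x = \nabla_{\theta} \log P(x \mid \theta)
% Fisher kernel between strings x and x', where I is the Fisher
% information matrix (often approximated by the identity in practice):
K(x, x') = U_x^{\top} \, I^{-1} \, U_{x'}
```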
Similarity and distance functions are essential to many learning algorithms, and thus training them has attracted a lot of interest. When it comes to dealing with structured data (e.g., strings or trees), edit similarities are widely used, and there exist a few methods for learning them. However, these methods offer no theoretical guarantee as to the generalization ability and discriminative power of the learned similarities.
Learning sparse combinations is a frequent theme in machine learning. In this paper, we study its associated optimization problem in the distributed setting, where the elements to be combined are not centrally located but spread over a network. We address the key challenge of balancing communication costs and optimization errors. To this end, we propose a…
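A common workhorse for learning sparse combinations is the Frank-Wolfe (conditional gradient) algorithm, which adds at most one element per iteration; here is a minimal centralized sketch over the l1 ball, where the use of Frank-Wolfe is an illustrative assumption about the setting, not a claim about this paper's algorithm:

```python
import numpy as np

def frank_wolfe_l1(grad_f, d, radius=1.0, n_iters=100):
    """Minimize a smooth convex f over the l1 ball of the given radius.
    Each step moves toward one signed coordinate vertex, so the iterate
    after t steps is a combination of at most t elements (sparse)."""
    x = np.zeros(d)
    for t in range(n_iters):
        g = grad_f(x)
        i = int(np.argmax(np.abs(g)))      # pick the steepest coordinate
        s = np.zeros(d)
        s[i] = -radius * np.sign(g[i])     # vertex of the l1 ball
        gamma = 2.0 / (t + 2.0)            # standard step size
        x = (1.0 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Usage: sparse least squares, f(x) = 0.5 * ||A x - b||^2
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 200)), rng.normal(size=50)
x = frank_wolfe_l1(lambda v: A.T @ (A @ v - b), d=200, radius=5.0)
print(np.count_nonzero(x))  # never more than n_iters non-zeros
```

The appeal of this family of methods in a network setting is structural: each iteration touches a single atom, so only one index and a step size need to be exchanged, which is what makes the communication cost manageable.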