Minimizing finite sums with the stochastic average gradient
TLDR
Numerical experiments indicate that the new SAG method often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets
TLDR
A new stochastic gradient method for optimizing a strongly convex sum of a finite set of smooth functions; it incorporates a memory of previous gradient values to achieve a linear convergence rate.
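The gradient-memory mechanism is simple to state in code. Below is a minimal numpy sketch for l2-regularized logistic regression; the function name, step-size choice, and regularization strength are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def sag_logistic(X, y, n_iters=10000, seed=0):
    """Minimal SAG sketch: keep the most recent gradient of each example
    and step along the average of the stored gradients."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    lam = 1.0 / n                        # assumed regularization strength
    L = 0.25 * np.max(np.sum(X**2, axis=1)) + lam
    step = 1.0 / L                       # assumed 1/L step size
    w = np.zeros(d)
    grad_table = np.zeros((n, d))        # memory of per-example gradients
    grad_sum = np.zeros(d)               # running sum of stored gradients
    for _ in range(n_iters):
        i = rng.integers(n)              # uniform sampling; non-uniform also possible
        g = -y[i] * X[i] / (1.0 + np.exp(y[i] * (X[i] @ w))) + lam * w
        grad_sum += g - grad_table[i]    # swap in example i's fresh gradient
        grad_table[i] = g
        w -= step * grad_sum / n         # step along the average gradient
    return w
```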
Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering
TLDR
A unified framework is provided for extending Locally Linear Embedding, Isomap, Laplacian Eigenmaps, and Multi-Dimensional Scaling, as well as Spectral Clustering, to out-of-sample points.
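The unifying idea is a Nystrom-style formula that embeds a new point from its kernel values to the training points. A minimal numpy sketch, up to each method's specific kernel normalization and scaling conventions (all names here are assumptions):

```python
import numpy as np

def spectral_embedding(K_train, k):
    """Eigendecompose the (assumed normalized) kernel/Gram matrix."""
    vals, vecs = np.linalg.eigh(K_train)
    order = np.argsort(vals)[::-1][:k]   # keep the top-k eigenpairs
    return vals[order], vecs[:, order]

def out_of_sample(k_x, vals, vecs):
    """Embed a new point from its kernel values k_x to the training set:
    project onto the training eigenvectors, scaled by inverse eigenvalues."""
    return (vecs.T @ k_x) / vals
```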
Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization
TLDR
This work shows that both the basic proximal-gradient method and the accelerated proximal-gradient method achieve the same convergence rate as in the error-free case, provided that the errors decrease at appropriate rates.
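As a concrete instance, here is a minimal numpy sketch of the basic proximal-gradient method applied to the lasso; the problem choice, step size, and the commented-out error model are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_gradient_lasso(X, y, lam, n_iters=500):
    """Basic proximal-gradient method for 0.5*||Xw - y||^2 + lam*||w||_1.
    In the inexact setting the gradient and/or proximal step carry errors;
    one could simulate this with a perturbation that shrinks over iterations."""
    L = np.linalg.norm(X, 2) ** 2        # Lipschitz constant of the smooth part
    w = np.zeros(X.shape[1])
    for it in range(1, n_iters + 1):
        grad = X.T @ (X @ w - y)         # exact gradient of the smooth part
        # e.g. grad += noise / it**2 to model errors decreasing fast enough
        w = soft_threshold(w - grad / L, lam / L)
    return w
```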
Representational Power of Restricted Boltzmann Machines and Deep Belief Networks
TLDR
This work proves that adding hidden units yields strictly improved modeling power, shows in a second theorem that RBMs are universal approximators of discrete distributions, and suggests a new, less greedy criterion for training RBMs within DBNs.
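The product structure behind these results is easy to see once the hidden units are summed out: each hidden unit contributes one multiplicative factor to the marginal over visible units. A minimal numpy sketch of the unnormalized RBM marginal (variable names are assumptions):

```python
import numpy as np

def rbm_unnormalized_prob(v, b, c, W):
    """Unnormalized RBM marginal with binary hidden units summed out:
    p(v) is proportional to exp(b.v) * prod_j (1 + exp(c_j + W_j.v)),
    so adding a hidden unit multiplies in one more factor."""
    return np.exp(b @ v) * np.prod(1.0 + np.exp(c + W @ v))
```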
A latent factor model for highly multi-relational data
TLDR
This paper proposes a method for modeling large multi-relational datasets, with possibly thousands of relations, based on a bilinear structure, which captures various orders of interaction of the data and also shares sparse latent factors across different relations.
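A minimal numpy sketch of the bilinear scoring idea, with each relation operator built as a sparse combination of rank-one latent factors shared across relations; the names and shapes are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def relation_operator(alpha_r, U, V):
    """Relation matrix R_r = sum_d alpha_r[d] * outer(U[d], V[d]).
    The (assumed sparse) alpha coefficients let thousands of relations
    reuse one small dictionary of latent factors."""
    return np.einsum('d,di,dj->ij', alpha_r, U, V)

def triple_score(e_s, alpha_r, e_o, U, V):
    """Bilinear score of a (subject, relation, object) triple."""
    return e_s @ relation_operator(alpha_r, U, V) @ e_o
```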
Learning Eigenfunctions Links Spectral Embedding and Kernel PCA
In this letter, we show a direct relationship between spectral embedding methods and kernel principal components analysis, and how both are special cases of a more general learning problem: learning the principal eigenfunctions of an operator defined from a kernel and the unknown data-generating density.
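In that view, both methods estimate eigenfunctions of a kernel operator from the training Gram matrix; a sketch of the standard empirical estimate (notation assumed):

```latex
% Eigenfunction estimate underlying both kernel PCA and spectral embeddings,
% where (\lambda_k, v_k) is the k-th eigenpair of the n-by-n normalized Gram matrix \tilde{K}:
f_k(x) = \frac{\sqrt{n}}{\lambda_k} \sum_{i=1}^{n} v_{k,i}\, \tilde{K}(x, x_i)
```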
Ask the locals: Multi-way local pooling for image recognition
TLDR
This work argues that a common trait in much recent work on image recognition and retrieval is that it leverages locality in feature space on top of purely spatial locality. It applies this idea in its simplest form to an object recognition system based on the spatial pyramid framework, increasing the performance of small dictionaries with very little added engineering.
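A minimal numpy sketch of the multi-way pooling idea: descriptors are binned jointly by spatial-pyramid cell and by nearest feature-space cluster, then max-pooled per bin. Whether clustering runs on raw descriptors or on their codes is a design choice; this sketch clusters the codes, and all names are assumptions.

```python
import numpy as np

def multiway_pool(codes, positions, centers, grid=(2, 2)):
    """Max-pool sparse codes per (spatial cell, feature-space cluster) bin.

    codes:     (n, k) sparse codes of local descriptors
    positions: (n, 2) descriptor coordinates in [0, 1)^2
    centers:   (c, k) feature-space cluster centers (e.g. from k-means)
    """
    n, k = codes.shape
    c = centers.shape[0]
    # nearest cluster in feature space for each descriptor
    d2 = ((codes[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    cluster = d2.argmin(1)
    # spatial cell index on the pyramid grid
    gx = np.minimum((positions[:, 0] * grid[0]).astype(int), grid[0] - 1)
    gy = np.minimum((positions[:, 1] * grid[1]).astype(int), grid[1] - 1)
    cell = gx * grid[1] + gy
    pooled = np.zeros((grid[0] * grid[1], c, k))
    for i in range(n):
        pooled[cell[i], cluster[i]] = np.maximum(pooled[cell[i], cluster[i]], codes[i])
    return pooled.reshape(-1)
```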
Topmoumoute Online Natural Gradient Algorithm
TLDR
An efficient, general, online approximation to natural gradient descent that is suited to large-scale problems; TONGA converges much faster than stochastic gradient descent, in both computation time and number of iterations, even on very large datasets.
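For intuition, here is a minimal numpy sketch of the dense natural-gradient update that TONGA approximates; TONGA itself maintains a cheap low-rank online estimate of the gradient covariance rather than forming it exactly from a minibatch as done here. The names and damping constant are assumptions.

```python
import numpy as np

def natural_gradient_step(w, per_example_grads, lr=0.1, eps=1e-4):
    """One dense natural-gradient step: precondition the average gradient
    by the (damped) empirical covariance of per-example gradients."""
    g = per_example_grads.mean(axis=0)                    # average gradient
    centered = per_example_grads - g
    C = centered.T @ centered / len(per_example_grads)    # gradient covariance
    direction = np.linalg.solve(C + eps * np.eye(len(w)), g)
    return w - lr * direction
```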
...