We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions, which achieves a faster convergence rate than black-box stochastic gradient (SG) methods.
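A minimal sketch of the SAG update may help make the idea concrete: rather than following a single sampled gradient, the method stores the most recently computed gradient of every component function and steps along their running average. The 1-D least-squares components below are an illustrative toy problem, not the paper's setup.

```python
import numpy as np

def sag(grad_i, n, x0, step, iters, rng):
    """SAG sketch for minimizing (1/n) * sum_i f_i(x).

    grad_i(i, x) returns the gradient of the i-th component at x.
    """
    x = x0.copy()
    memory = np.zeros((n,) + x0.shape)   # last seen gradient of each f_i
    avg = np.zeros_like(x0)              # average of the stored gradients
    for _ in range(iters):
        i = rng.integers(n)
        g = grad_i(i, x)
        avg += (g - memory[i]) / n       # refresh the average in O(dim)
        memory[i] = g
        x -= step * avg
    return x

# Toy problem: f_i(x) = 0.5*(a_i*x - 2*a_i)^2, so every component
# (and hence the sum) is minimized at x = 2.
a = np.linspace(0.5, 1.5, 20)
x = sag(lambda i, x: a[i] * (a[i] * x - 2.0 * a[i]), 20,
        np.zeros(1), 0.02, 3000, np.random.default_rng(0))
```

The key design point is that `avg` is maintained incrementally, so each iteration touches only one component gradient yet steps along an estimate of the full gradient.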

This paper provides a unified framework for extending Locally Linear Embedding (LLE), Isomap, Laplacian Eigenmaps, and Multi-Dimensional Scaling (for dimensionality reduction), as well as Spectral Clustering.

We consider the problem of optimizing the sum of a smooth convex function and a non-smooth convex term using proximal-gradient methods, where an error is present in the calculation of the gradient of the smooth term.
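A short sketch of the setting: each proximal-gradient iteration takes a (possibly inexact) gradient step on the smooth term, then applies the proximal operator of the non-smooth term. The least-squares plus L1 objective and the Gaussian noise model for the gradient error are illustrative assumptions, not the paper's exact analysis.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1 (elementwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_gradient(A, b, lam, step, iters, noise=0.0, rng=None):
    """Sketch of proximal gradient for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    `noise` scales an additive Gaussian error on the smooth gradient,
    modeling the inexact-gradient setting.
    """
    rng = rng or np.random.default_rng(0)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                  # gradient of the smooth term
        grad += noise * rng.normal(size=x.shape)  # inexact-gradient model
        x = soft_threshold(x - step * grad, step * lam)
    return x

A = np.eye(3)
b = np.array([3.0, 0.05, -2.0])
x = prox_gradient(A, b, lam=0.1, step=0.5, iters=200)
# With A = I the exact solution is soft_threshold(b, 0.1) = [2.9, 0, -1.9].
```

With `noise > 0` the iterates hover in a neighborhood of the solution whose size depends on the error level, which is the regime the inexact-gradient analysis addresses.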

We first prove that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions, a property similar to neural networks with one hidden layer.

We show a direct relation between spectral embedding methods and kernel principal components analysis and how both are special cases of a more general learning problem: learning the principal eigenfunctions of an operator defined from a kernel and the unknown data-generating density.
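The connection can be sketched in a few lines: the principal eigenfunctions are estimated from the eigenvectors of the Gram matrix (as in kernel PCA) and extended to new points via a Nyström-style formula. The Gaussian kernel and its bandwidth are illustrative choices, not the paper's specific construction.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2*sigma^2)) for all pairs.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kernel_eigenfunctions(X, k, sigma=1.0):
    """Estimate the top-k eigenfunctions of the kernel operator from data X."""
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    vals, vecs = np.linalg.eigh(K)                 # ascending eigenvalues
    vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]

    def f(Y):
        # Nystrom extension: f_j(y) = (sqrt(n)/vals_j) * sum_i vecs[i,j]*k(y, x_i)
        return gaussian_kernel(Y, X, sigma) @ vecs * (np.sqrt(n) / vals)

    return vals, f

X = np.random.default_rng(0).normal(size=(50, 2))
vals, f = kernel_eigenfunctions(X, k=3)
emb = f(X)  # at the training points this recovers the scaled eigenvectors
```

Evaluating the eigenfunctions at out-of-sample points is exactly what lets spectral embeddings generalize beyond the training set, which is the practical payoff of the kernel-PCA view.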

We propose to apply this idea in its simplest form to an object recognition system based on the spatial pyramid framework, to increase the performance of small dictionaries with very little added engineering.