Publications
Minimizing finite sums with the stochastic average gradient
TLDR
We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions, which achieves a faster convergence rate than black-box stochastic gradient (SG) methods.
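A minimal sketch of the SAG-style update, under my own assumptions (the oracle `grad_i`, the constant `step`, and the function names are illustrative, not the paper's reference code): a table stores the most recent gradient seen for each term, and each step follows the running average of that table.

```python
import numpy as np

def sag(grad_i, x0, n, step, iters, rng=None):
    """grad_i(i, x) returns the gradient of the i-th smooth function at x."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    table = np.zeros((n, x0.size))   # most recent gradient stored for each f_i
    avg = np.zeros_like(x0)          # running average of the stored gradients
    for _ in range(iters):
        i = rng.integers(n)
        g_new = grad_i(i, x)
        avg += (g_new - table[i]) / n  # replace f_i's slot in the average
        table[i] = g_new
        x -= step * avg                # step along the average of stored gradients
    return x
```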
A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets
TLDR
We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex.
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
TLDR
In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent.
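For reference, the Polyak-Łojasiewicz inequality the title refers to can be written as follows (a standard statement of the condition and the resulting rate, not a quotation from the paper):

```latex
% PL inequality for an L-smooth function f with minimum value f^*:
\frac{1}{2}\,\lVert \nabla f(x) \rVert^2 \;\ge\; \mu \bigl(f(x) - f^*\bigr)
\quad \text{for all } x,
% under which gradient descent with step size 1/L satisfies
f(x_k) - f^* \;\le\; \Bigl(1 - \tfrac{\mu}{L}\Bigr)^{k} \bigl(f(x_0) - f^*\bigr).
```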
Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization
TLDR
We consider the problem of optimizing the sum of a smooth convex function and a non-smooth convex term using proximal-gradient methods, where an error is present in the calculation of the gradient of the smooth term.
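A minimal sketch of an inexact proximal-gradient iteration, assuming a hypothetical oracle `approx_grad` that returns the smooth-term gradient up to an error and a `prox` callable implementing the proximal operator of the non-smooth term (the soft-thresholding example is one common choice, not necessarily the paper's setting):

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1, included as an example prox."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def inexact_proximal_gradient(approx_grad, prox, x0, step, iters):
    x = x0.copy()
    for _ in range(iters):
        g = approx_grad(x)            # gradient of the smooth term, possibly with error
        x = prox(x - step * g, step)  # e.g. prox=soft_threshold for an L1 term
    return x
```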
Block-Coordinate Frank-Wolfe Optimization for Structural SVMs
TLDR
We propose a novel randomized block-coordinate version of the Frank-Wolfe algorithm for structural SVMs that achieves an Õ(1/ε) convergence rate while only requiring a single call to the maximization oracle on each iteration.
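A generic block-coordinate Frank-Wolfe sketch (not the structural-SVM specialization; `grad`, `lmo_block`, and the block representation are my assumptions): one block is sampled per iteration and a single call to that block's linear-minimization oracle replaces the full oracle.

```python
import numpy as np

def block_coordinate_frank_wolfe(grad, lmo_block, x_blocks, iters, rng=None):
    """grad(x_blocks) -> per-block gradients; lmo_block(i, g_i) -> minimizer of
    <g_i, s> over block i's feasible set; x_blocks is a list of block variables."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x_blocks)
    for k in range(iters):
        i = rng.integers(n)
        g = grad(x_blocks)                 # only block i of the gradient is used
        s_i = lmo_block(i, g[i])           # single oracle call for the sampled block
        gamma = 2.0 * n / (k + 2.0 * n)    # standard block-coordinate FW step size
        x_blocks[i] = (1 - gamma) * x_blocks[i] + gamma * s_i
    return x_blocks
```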
Accelerated training of conditional random fields with stochastic gradient methods
TLDR
We apply Stochastic Meta-Descent (SMD), a stochastic gradient optimization method with gain vector adaptation, to the training of Conditional Random Fields (CRFs).
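A simplified per-parameter gain-adaptation step in the spirit of SMD (full SMD maintains a trace via Hessian-vector products, which is omitted here; all names and constants are illustrative): gains grow when successive stochastic gradients agree and shrink when they disagree.

```python
import numpy as np

def adaptive_gain_sgd(grad, w0, iters, eta0=0.1, meta=0.05):
    w = w0.copy()
    eta = np.full_like(w0, eta0)      # one gain (step size) per parameter
    g_prev = np.zeros_like(w0)
    for _ in range(iters):
        g = grad(w)                    # stochastic gradient at w
        eta *= np.maximum(0.5, 1.0 + meta * g * g_prev)  # multiplicative gain update
        w -= eta * g
        g_prev = g
    return w
```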
Hybrid Deterministic-Stochastic Methods for Data Fitting
TLDR
By controlling the sample size in an incremental gradient algorithm, it is possible to maintain the steady convergence rates of full-gradient methods.
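A minimal sketch of a growing-batch ("hybrid") gradient method, under my own assumptions (the oracle `grad_subset` and the geometric growth schedule are illustrative): early iterations use small random subsets of the n terms, and the sample size is increased so that later iterations behave like full-gradient steps.

```python
import numpy as np

def hybrid_batching_gd(grad_subset, x0, n, step, iters, batch0=1, growth=1.1, rng=None):
    """grad_subset(idx, x) returns the average gradient over the terms indexed by idx."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    batch = float(batch0)
    for _ in range(iters):
        b = min(n, int(round(batch)))
        idx = rng.choice(n, size=b, replace=False)
        x -= step * grad_subset(idx, x)
        batch = min(n, batch * growth)   # gradually increase the sample size
    return x
```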
Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches
TLDR
L1 regularization is effective for feature selection, but the resulting optimization is challenging due to the non-differentiability of the 1-norm.
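One common way around the non-differentiability of the 1-norm is to replace |x| with a smooth surrogate and apply any gradient-based solver; the surrogate below (sqrt(x² + ε)) is a generic illustration of that idea, not necessarily one of the approaches studied in the paper.

```python
import numpy as np

def smoothed_l1_objective(loss_and_grad, lam, eps=1e-6):
    """Wrap a smooth loss with a differentiable approximation of lam * ||w||_1."""
    def f(w):
        loss, g = loss_and_grad(w)          # smooth data-fitting term and its gradient
        s = np.sqrt(w**2 + eps)             # smooth surrogate for |w|
        return loss + lam * s.sum(), g + lam * w / s
    return f
```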
Stop Wasting My Gradients: Practical SVRG
TLDR
We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods.
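For context, a minimal sketch of the basic SVRG iteration the paper builds on (not the specific practical variants it proposes; `grad_i`, `step`, and the loop lengths are illustrative): a full gradient is computed at a snapshot point, and inner stochastic steps are corrected by that snapshot gradient to reduce variance.

```python
import numpy as np

def svrg(grad_i, x0, n, step, outer_iters, inner_iters, rng=None):
    """grad_i(i, x) returns the gradient of the i-th term at x."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(outer_iters):
        snapshot = x.copy()
        full_grad = sum(grad_i(i, snapshot) for i in range(n)) / n
        for _ in range(inner_iters):
            i = rng.integers(n)
            v = grad_i(i, x) - grad_i(i, snapshot) + full_grad  # variance-reduced direction
            x -= step * v
    return x
```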
Learning Graphical Model Structure Using L1-Regularization Paths
TLDR
Sparsity-promoting L1-regularization has recently been successfully used to learn the structure of undirected graphical models.