
Fast Convergence of Stochastic Gradient Descent under a Strong Growth Condition

@article{Schmidt2013FastCO,
  title={Fast Convergence of Stochastic Gradient Descent under a Strong Growth Condition},
  author={Mark W. Schmidt and Nicolas Le Roux},
  journal={arXiv: Optimization and Control},
  year={2013}
}
We consider optimizing a smooth convex function $f$ that is the average of a set of differentiable functions $f_i$, under the assumption considered by Solodov [1998] and Tseng [1998] that the norm of each gradient $f_i'$ is bounded by a linear function of the norm of the average gradient $f'$. We show that under these assumptions the basic stochastic gradient method with a sufficiently small constant step-size has an $O(1/k)$ convergence rate, and has a linear convergence rate if $f$ is strongly convex.
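
For concreteness, the setting in the abstract can be written out as follows. This is a sketch: the constant $B$, the uniform sampling, and the homogeneous form of the "linear function" bound (the form usually called the strong growth condition) are supplied here for illustration rather than quoted from the paper.

$$f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x), \qquad \|f_i'(x)\| \le B\,\|f'(x)\| \quad \text{for all } i \text{ and } x,$$

$$x_{k+1} = x_k - \alpha\, f_{i_k}'(x_k), \qquad i_k \text{ uniform on } \{1,\dots,n\}, \quad \alpha \text{ a sufficiently small constant step-size.}$$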

On the linear convergence of the stochastic gradient method with constant step-size

This paper provides a necessary condition for the linear convergence of the stochastic gradient method with constant step-size (SGM-CS) that is weaker than the strong growth condition (SGC), and shows that both SGM-CS and its projected variant, under a restricted strong convexity assumption, exhibit linear convergence to a noise-dominated region.

Towards Asymptotic Optimality with Conditioned Stochastic Gradient Descent

This paper investigates a general class of stochastic gradient descent algorithms, called conditioned SGD, based on a preconditioning of the gradient direction, and establishes almost-sure convergence and asymptotic normality for a broad class of conditioning matrices.
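
To illustrate the preconditioned update this summary refers to, here is a minimal sketch in Python; the callables grad and cond_matrix, the decreasing step-size schedule, and all names are assumptions made for this example, not the paper's notation or code.

import numpy as np

def conditioned_sgd(grad, x0, cond_matrix, steps=1000, gamma0=1.0):
    """Sketch of conditioned SGD: x_{k+1} = x_k - gamma_k * C_k @ g_k.

    grad(x, k) returns a stochastic gradient estimate at x;
    cond_matrix(x, k) returns the conditioning matrix C_k
    (e.g. the identity recovers plain SGD).
    """
    x = np.asarray(x0, dtype=float)
    for k in range(steps):
        g = grad(x, k)
        C = cond_matrix(x, k)
        gamma = gamma0 / (k + 1)  # illustrative decreasing step-size
        x = x - gamma * (C @ g)
    return x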

Linear Convergence of Adaptive Stochastic Gradient Descent

We prove that the norm version of the adaptive stochastic gradient method (AdaGrad-Norm) achieves a linear convergence rate for certain subclasses of strongly convex functions and of non-convex functions.
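
The AdaGrad-Norm update mentioned above scales a single global step-size by the accumulated gradient norms, rather than keeping per-coordinate accumulators. A minimal sketch, with the parameter names eta and b0 chosen here for illustration:

import numpy as np

def adagrad_norm(grad, x0, steps=1000, eta=1.0, b0=1e-2):
    """Sketch of AdaGrad-Norm: one scalar accumulator for all coordinates.

    grad(x) returns a (stochastic) gradient at x.
    """
    x = np.asarray(x0, dtype=float)
    b2 = b0 ** 2                   # squared norm accumulator
    for _ in range(steps):
        g = grad(x)
        b2 += float(np.dot(g, g))  # accumulate squared gradient norm
        x = x - (eta / np.sqrt(b2)) * g
    return x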

Minimizing finite sums with the stochastic average gradient

Numerical experiments indicate that the new SAG method often dramatically outperforms existing stochastic gradient (SG) and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
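
For context, the stochastic average gradient (SAG) method keeps a table holding the most recent gradient of each $f_i$ and steps along the running average of that table. The sketch below is the commonly stated form of the update, with uniform sampling and variable names chosen here for illustration; it is not the paper's exact pseudocode.

import numpy as np

def sag(grad_i, x0, n, steps=1000, alpha=0.01, rng=None):
    """Sketch of the stochastic average gradient (SAG) update.

    grad_i(i, x) returns the gradient of the i-th component f_i at x.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    table = np.zeros((n, x.size))  # last seen gradient of each f_i
    d = np.zeros_like(x)           # running sum of the table rows
    for _ in range(steps):
        i = rng.integers(n)
        g = grad_i(i, x)
        d += g - table[i]          # swap the old gradient out of the sum
        table[i] = g
        x = x - (alpha / n) * d    # step along the average stored gradient
    return x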

Stochastic Approximation of Smooth and Strongly Convex Functions: Beyond the O(1/T) Convergence Rate

This paper makes use of smoothness and strong convexity simultaneously to boost the convergence rate of stochastic approximation (SA), demonstrating that, in expectation, an $O(1/2^{T/\kappa} + F_*)$ risk bound is achievable, and thereby obtaining global linear convergence up to the $F_*$ term.

General Convergence Analysis of Stochastic First-Order Methods for Composite Optimization

  • I. Necoara
  • Computer Science, Mathematics
    J. Optim. Theory Appl.
  • 2021
This paper considers stochastic composite convex optimization problems whose objective function satisfies a stochastic bounded gradient condition, with or without a quadratic functional growth property, covering a large class of objective functions.

On the linear convergence of the projected stochastic gradient method with constant step-size

It is shown that both the projected stochastic gradient method with constant step-size (PSGM-CS) and the proximal stochastic gradient method exhibit linear convergence to a noise-dominated region, whose distance to the optimal solution is proportional to $\gamma \sigma$, when the SGC is violated up to an additive perturbation.

A delayed proximal gradient method with linear convergence rate

This paper derives an explicit expression that quantifies how the convergence rate depends on objective function properties and algorithm parameters such as step-size and the maximum delay, and reveals the trade-off between convergence speed and residual error.

Unified Optimal Analysis of the (Stochastic) Gradient Method

  • S. Stich
  • Computer Science, Mathematics
    ArXiv
  • 2019
This note gives a simple proof of the convergence of stochastic gradient methods on $\mu$-convex functions under a (milder than standard) $L$-smoothness assumption, and recovers the exponential convergence rate.

A globally convergent incremental Newton method

It is shown that the incremental Newton method for minimizing the sum of a large number of strongly convex functions is globally convergent under a variable stepsize rule, and that under a gradient growth condition the convergence rate is linear for both variable and constant stepsize rules.
...

References


Incremental Gradient Algorithms with Stepsizes Bounded Away from Zero

  • M. Solodov
  • Computer Science
    Comput. Optim. Appl.
  • 1998
This paper derives the first convergence results of any kind for this computationally important case, shows that a certain ε-approximate solution can be obtained, and establishes the linear dependence of ε on the stepsize limit.

An Incremental Gradient(-Projection) Method with Momentum Term and Adaptive Stepsize Rule

  • P. Tseng
  • Computer Science
    SIAM J. Optim.
  • 1998
We consider an incremental gradient method with momentum term for minimizing the sum of continuously differentiable functions. This method uses a new adaptive stepsize rule that decreases the stepsize…

Robust Stochastic Approximation Approach to Stochastic Programming

It is intended to demonstrate that a properly modified SA approach can be competitive with, and even significantly outperform, the sample average approximation (SAA) method for a certain class of convex stochastic problems.

Efficient Methods in Convex Programming

Introductory Lectures on Convex Optimization - A Basic Course

It was in the middle of the 1980s when the seminal paper by Karmarkar opened a new epoch in nonlinear optimization, and it became more and more common for new methods to be provided with a complexity analysis, which was considered a better justification of their efficiency than computational experiments.