• Corpus ID: 211010703

Complexity Guarantees for Polyak Steps with Momentum

@inproceedings{Barre2020ComplexityGF,
  title={Complexity Guarantees for Polyak Steps with Momentum},
  author={Mathieu Barré and Adrien B. Taylor and Alexandre d’Aspremont},
  booktitle={Conference on Learning Theory},
  year={2020}
}
In smooth strongly convex optimization, knowledge of the strong convexity parameter is critical for obtaining simple methods with accelerated rates. In this work, we study a class of methods, based on Polyak steps, where this knowledge is substituted by that of the optimal value, $f_*$. We first show slightly improved convergence bounds compared to those previously known for the classical case of simple gradient descent with Polyak steps; we then derive an accelerated gradient method with Polyak steps and…
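
For concreteness, the classical Polyak step size mentioned above sets the step from the current optimality gap, using $f_*$ in place of the strong convexity parameter. Below is a minimal Python sketch of plain gradient descent with this step size; the momentum variant analyzed in the paper and its exact step-size scaling are not reproduced here, and the quadratic test function is only an illustrative assumption.

    import numpy as np

    def polyak_gradient_descent(f, grad_f, x0, f_star, n_iters=200, tol=1e-12):
        # Gradient descent with the classical Polyak step size
        #   gamma_k = (f(x_k) - f_*) / ||grad f(x_k)||^2,
        # which uses knowledge of the optimal value f_* instead of the
        # strong convexity parameter.
        x = np.asarray(x0, dtype=float)
        for _ in range(n_iters):
            g = grad_f(x)
            gap = f(x) - f_star
            if gap <= tol or g.dot(g) <= tol:
                break  # (numerically) optimal
            x = x - (gap / g.dot(g)) * g
        return x

    # Illustrative use on a strongly convex quadratic f(x) = 0.5 x^T A x (f_* = 0)
    A = np.diag([1.0, 10.0, 100.0])
    f = lambda x: 0.5 * x.dot(A.dot(x))
    grad_f = lambda x: A.dot(x)
    x_hat = polyak_gradient_descent(f, grad_f, x0=np.ones(3), f_star=0.0)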

Citations

Trainable Learning Rate

  • Computer Science
  • 2021
The rationale behind the approach is to train the learning rate along with the model weights, akin to line search, and to formulate first- and second-order gradients with respect to the learning rate as functions of consecutive weight gradients, leading to a cost-effective implementation.
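
A worked expression of the kind alluded to here is the standard one-step hypergradient computation (an assumption: this may not be the paper's exact formulation). For the update $w_t = w_{t-1} - \eta\,\nabla f(w_{t-1})$, treating $w_{t-1}$ as fixed, the chain rule gives

    \frac{\partial f(w_t)}{\partial \eta} = \nabla f(w_t)^\top \frac{\partial w_t}{\partial \eta} = -\,\nabla f(w_t)^\top \nabla f(w_{t-1}),

so the first-order gradient with respect to the learning rate is indeed a function of two consecutive weight gradients.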

Quadratic minimization: from conjugate gradient to an adaptive Heavy-ball method with Polyak step-sizes

An adaptive variation on the classical Heavy-ball method for convex quadratic minimization that relies on so-called “Polyak step-sizes”, which consist in using knowledge of the optimal value of the optimization problem at hand instead of problem parameters such as a few eigenvalues of its Hessian.
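
The generic template combining a heavy-ball update with a Polyak-type step size reads as follows; the specific adaptive choices of $\gamma_k$ and $\beta_k$ are the paper's contribution and are not reproduced here.

    x_{k+1} = x_k - \gamma_k \nabla f(x_k) + \beta_k (x_k - x_{k-1}), \qquad \gamma_k \;\propto\; \frac{f(x_k) - f_*}{\|\nabla f(x_k)\|^2}.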

Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses

A computer-assisted approach to the analysis of the worst-case convergence of nonlinear conjugate gradient methods (NCGMs), which establishes novel complexity bounds for the Polak-Ribière-Polyak and Fletcher-Reeves NCGMs for smooth strongly convex minimization.
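
For reference, the two classical update rules named here set $d_{k+1} = -\nabla f(x_{k+1}) + \beta_k d_k$ with

    \beta_k^{\mathrm{FR}} = \frac{\|\nabla f(x_{k+1})\|^2}{\|\nabla f(x_k)\|^2}, \qquad
    \beta_k^{\mathrm{PRP}} = \frac{\nabla f(x_{k+1})^\top\!\left(\nabla f(x_{k+1}) - \nabla f(x_k)\right)}{\|\nabla f(x_k)\|^2};

the line-search conditions under which the worst-case rates are established are specified in the paper itself.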

Iteration Complexity of Fixed-Step-Momentum Methods for Convex Quadratic Functions

This note considers the momentum method without line search but with fixed step length applied to strictly convex quadratic functions assuming that exact gradients are used and appropriate upper and…

Iteration Complexity of Fixed-Step Methods by Nesterov and Polyak for Convex Quadratic Functions

This note considers the momentum method by Polyak and the accelerated gradient method by Nesterov, both without line search but with fixed step length applied to strictly convex quadratic functions…
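
For context on both notes above, the classical fixed tunings for a quadratic with Hessian eigenvalues in $[\mu, L]$ are shown below; whether these are exactly the parameter choices analyzed there is an assumption, and the notes should be consulted for the precise settings.

    \text{Heavy ball (Polyak):}\quad x_{k+1} = x_k - \frac{4}{(\sqrt{L}+\sqrt{\mu})^2}\,\nabla f(x_k) + \left(\frac{\sqrt{L}-\sqrt{\mu}}{\sqrt{L}+\sqrt{\mu}}\right)^{\!2} (x_k - x_{k-1}),

    \text{Nesterov (constant momentum):}\quad y_k = x_k + \frac{\sqrt{L}-\sqrt{\mu}}{\sqrt{L}+\sqrt{\mu}}\,(x_k - x_{k-1}), \qquad x_{k+1} = y_k - \frac{1}{L}\,\nabla f(y_k).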

Branch-and-Bound Performance Estimation Programming: A Unified Methodology for Constructing Optimal Optimization Methods

The BnB-PEP methodology is applied to several setups for which prior methodologies do not apply, obtaining methods with bounds that improve upon prior state-of-the-art results and thereby systematically generating analytical convergence proofs.

Principled Analyses and Design of First-Order Methods with Inexact Proximal Operators

It is shown that worst-case guarantees for algorithms relying on such inexact proximal operations can be systematically obtained through a generic procedure based on semidefinite programming.

Fast Stochastic Bregman Gradient Methods: Sharp Analysis and Variance Reduction

This work proves the convergence of Bregman Stochastic Gradient Descent to a region that depends on the noise (magnitude of the gradients) at the optimum and shows that variance reduction can be used to counter the effect of noise.
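
For readers unfamiliar with the setup, the Bregman (mirror-descent-type) stochastic gradient step with reference function $h$ reads as below; this is a generic template, and the paper's precise assumptions on $h$ and on the noise are not reproduced here.

    x_{k+1} = \operatorname*{arg\,min}_{x} \;\Big\{ \gamma_k \langle g_k, x\rangle + D_h(x, x_k) \Big\}, \qquad
    D_h(x, y) = h(x) - h(y) - \langle \nabla h(y),\, x - y\rangle,

where $g_k$ is a stochastic gradient at $x_k$.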

Acceleration via Fractal Learning Rate Schedules

This work reinterprets an iterative algorithm from the numerical analysis literature as what it calls the Chebyshev learning rate schedule for accelerating vanilla gradient descent, and shows that the problem of mitigating instability leads to a fractal ordering of step sizes.
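
A standard way to obtain such Chebyshev step sizes, for a quadratic with spectrum in $[\mu, L]$ and a horizon of $T$ steps, is to take the reciprocals of the roots of the shifted degree-$T$ Chebyshev polynomial; the fractal reordering of these steps, which is the paper's stability fix, is not reproduced here.

    \gamma_t = \left(\frac{L+\mu}{2} + \frac{L-\mu}{2}\,\cos\frac{(2t-1)\pi}{2T}\right)^{-1}, \qquad t = 1, \dots, T.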

Second-order Conditional Gradient Sliding.

The SOCGS algorithm is presented, which uses a projection-free algorithm to solve the constrained quadratic subproblems inexactly; it is useful when the feasible region can only be accessed efficiently through a linear optimization oracle and when computing first-order information of the function, although possible, is costly.
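
The linear optimization oracle mentioned here is the standard Frank-Wolfe building block, which only requires minimizing a linear function over the feasible set $\mathcal{X}$; SOCGS's inexactly solved second-order subproblems are not reproduced here.

    v_k = \operatorname*{arg\,min}_{v \in \mathcal{X}} \;\langle \nabla f(x_k), v\rangle, \qquad x_{k+1} = x_k + \eta_k\,(v_k - x_k), \quad \eta_k \in [0, 1].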

References

Showing 1-10 of 47 references

Smooth strongly convex interpolation and exact worst-case performance of first-order methods

We show that the exact worst-case performance of fixed-step first-order methods for unconstrained optimization of smooth (possibly strongly) convex functions can be obtained by solving convex…

Revisiting the Polyak step size

This paper revisits the Polyak step size schedule for convex optimization problems, proving that a simple variant of it simultaneously attains near-optimal convergence rates for the gradient descent…

An Introduction to Optimization

  • E. Chong, S. Żak
  • Computer Science
    IEEE Antennas and Propagation Magazine
  • 1996
An Introduction to Optimization, Second Edition helps students build a solid working knowledge of the field, including unconstrained optimization, linear programming, and constrained optimization.

Gradient methods for minimizing composite functions

  • Y. Nesterov
  • Mathematics, Computer Science
    Math. Program.
  • 2013
In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two terms: one is smooth and given by a black-box oracle, and another is…
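
The basic composite gradient step for $F = f + \Psi$, with $f$ smooth and $\Psi$ the second (possibly nonsmooth) term, can be written in the textbook form below, shown here only for orientation.

    x_{k+1} = \operatorname*{arg\,min}_{x}\Big\{ f(x_k) + \langle \nabla f(x_k), x - x_k\rangle + \frac{L}{2}\|x - x_k\|^2 + \Psi(x) \Big\}
            = \operatorname{prox}_{\Psi/L}\!\left(x_k - \tfrac{1}{L}\nabla f(x_k)\right).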

Performance of first-order methods for smooth convex minimization: a novel approach

A novel approach for analyzing the worst-case performance of first-order black-box optimization methods, which focuses on smooth unconstrained convex minimization over the Euclidean space and derives a new and tight analytical bound on its performance.
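
Schematically, the worst-case question addressed in this line of work is itself posed as an optimization problem over functions and starting points; the exact constraints and the resulting tight bound are in the paper.

    \max_{f,\, x_0}\; f(x_N) - f_* \quad \text{s.t.}\quad f \text{ is convex and } L\text{-smooth}, \;\; \|x_0 - x_*\| \le R, \;\; x_1, \dots, x_N \text{ produced by the method from } x_0.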

Lectures on Convex Optimization

Incremental subgradient methods for nondifferentiable optimization

  • A. Nedić, D. Bertsekas
  • Mathematics, Computer Science
    Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304)
  • 1999
The convergence properties of a number of variants of incremental subgradient methods, including some that are stochastic, are established; these methods appear very promising and effective for important classes of large problems.
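
The incremental subgradient iteration for $f = \sum_{i=1}^m f_i$ over a closed convex set $X$ processes one component at a time (a generic form; the stochastic variants sample the index $i_k$ at random):

    x_{k+1} = P_X\!\left(x_k - \alpha_k\, g_{i_k}\right), \qquad g_{i_k} \in \partial f_{i_k}(x_k),

where $P_X$ denotes projection onto $X$ and $i_k$ cycles through $\{1, \dots, m\}$.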

Subgradient methods. Lecture notes of EE392o, Stanford University

  • Autumn Quarter, 2003