New analysis of linear convergence of gradient-type methods via unifying error bound conditions

@article{Zhang2020NewAO,
  title={New analysis of linear convergence of gradient-type methods via unifying error bound conditions},
  author={Hui Zhang},
  journal={Mathematical Programming},
  year={2020},
  volume={180},
  pages={371-416}
}
  • Hui Zhang
  • Published 2020
  • Mathematics, Computer Science
  • Mathematical Programming
This paper reveals that a residual measure operator plays a common and central role in many error bound (EB) conditions and in a variety of gradient-type methods. On the one hand, by linking this operator with other optimality measures, we define a group of abstract EB conditions and analyze the interplay between them; on the other hand, by using this operator as an ascent direction, we propose an abstract gradient-type method and derive EB conditions that are necessary and…
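For orientation, one standard instance of such a residual measure operator, given here only as a hedged illustration and not necessarily in the paper's exact form, is the proximal-gradient mapping for the composite problem \(\min_x f(x)+g(x)\) with \(f\) smooth and \(g\) closed, proper, and convex; the step size \(\gamma>0\) and the constants \(\kappa, r\) below are generic placeholders:

\[
  R_\gamma(x) \;=\; \tfrac{1}{\gamma}\Bigl(x-\operatorname{prox}_{\gamma g}\bigl(x-\gamma\nabla f(x)\bigr)\Bigr),
  \qquad
  \operatorname{prox}_{\gamma g}(y) \;=\; \arg\min_z\Bigl\{\,g(z)+\tfrac{1}{2\gamma}\|z-y\|^2\Bigr\},
\]

and a typical error bound condition built on this residual reads

\[
  \operatorname{dist}\bigl(x,\mathcal{X}^\star\bigr)\;\le\;\kappa\,\bigl\|R_\gamma(x)\bigr\|
  \qquad\text{whenever } f(x)+g(x)\le r,
\]

where \(\mathcal{X}^\star\) denotes the solution set; \(R_\gamma(x)=0\) exactly when \(x\) is a stationary point (an optimum in the convex case).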
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak–Łojasiewicz Condition (June 2018)
In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Łojasiewicz inequality proposed…
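For the reader's convenience, the condition referred to in this entry is the Polyak–Łojasiewicz (PL) inequality; the statement below is the standard one and is independent of the present paper. An \(L\)-smooth function \(f\) with minimum value \(f^\star\) satisfies PL with constant \(\mu>0\) if

\[
  \tfrac12\,\bigl\|\nabla f(x)\bigr\|^2 \;\ge\; \mu\,\bigl(f(x)-f^\star\bigr)\qquad\text{for all } x,
\]

and under this condition gradient descent with step size \(1/L\) enjoys the global linear rate

\[
  f(x_k)-f^\star \;\le\; \Bigl(1-\tfrac{\mu}{L}\Bigr)^{k}\bigl(f(x_0)-f^\star\bigr),
\]

without requiring convexity of \(f\).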
Level-set Subdifferential Error Bounds and Linear Convergence of Variable Bregman Proximal Gradient Method
In this work, we develop a level-set subdifferential error bound condition aiming towards convergence rate analysis of a variable Bregman proximal gradient (VBPG) method for a broad class of…
Proximal-Like Incremental Aggregated Gradient Method with Linear Convergence Under Bregman Distance Growth Conditions
A unified algorithmic framework is presented for minimizing the sum of smooth convex component functions and a proper closed convex regularization function that is possibly non-smooth and extended-valued, with an additional abstract feasible set whose geometry can be captured by the domain of a Legendre function.
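For reference, the Bregman distance generated by a Legendre function \(h\), which underlies the growth conditions named in this entry, is the standard object

\[
  D_h(x,y) \;=\; h(x)-h(y)-\bigl\langle \nabla h(y),\,x-y\bigr\rangle ,
\]

which reduces to the squared Euclidean distance \(\tfrac12\|x-y\|^2\) when \(h=\tfrac12\|\cdot\|^2\); the specific Bregman distance growth conditions of the cited paper are not reproduced here.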
Variational Analysis Perspective on Linear Convergence of Some First Order Methods for Nonsmooth Convex Optimization Problems
We study linear convergence of some first-order methods such as the proximal gradient method (PGM), the proximal alternating linearized minimization (PALM) algorithm, and the randomized block…
Global complexity analysis of inexact successive quadratic approximation methods for regularized optimization under mild assumptions
This paper presents an algorithmic framework of inexact SQA methods with four types of line searches, and analyzes its global complexity under milder assumptions, showing its well-definedness and some decreasing properties.
Faster subgradient methods for functions with Hölderian growth
This manuscript derives new convergence results for several subgradient methods applied to minimizing nonsmooth convex functions with Hölderian growth, and develops an adaptive variant of the “descending stairs” stepsize that achieves the same convergence rate without requiring an error bound constant, which is difficult to estimate in practice.
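The Hölderian growth mentioned above is usually stated as follows; the constant \(c>0\) and exponent \(p\ge 1\) are generic, and the restriction to a sublevel set is one common convention:

\[
  f(x)-f^\star \;\ge\; c\,\operatorname{dist}\bigl(x,\mathcal{X}^\star\bigr)^{p}
  \qquad\text{for all } x \text{ in a fixed sublevel set},
\]

with \(p=1\) corresponding to sharp growth and \(p=2\) to quadratic growth; the achievable subgradient-method rates depend on \(p\).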
A Variational Approach on Level sets and Linear Convergence of Variable Bregman Proximal Gradient Method for Nonconvex Optimization Problems
We develop a new variational approach on level sets aiming towards convergence rate analysis of a variable Bregman proximal gradient (VBPG) method for a broad class of nonsmooth and nonconvex…
First-Order Methods for Convex Constrained Optimization under Error Bound Conditions with Unknown Growth Parameters
We propose first-order methods, based on a level-set technique, for convex constrained optimization problems that satisfy an error bound condition with unknown growth parameters. The proposed approach solves…
Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions
This work establishes fast and intermediate rates of an efficient stochastic approximation (SA) algorithm for risk minimization with Lipschitz continuous random functions, which requires only one pass over $n$ samples and adapts to the error bound condition (EBC).
Greed is good: greedy optimization methods for large-scale structured problems
This dissertation shows that greedy coordinate descent and Kaczmarz methods have efficient implementations and can be faster than their randomized counterparts for certain common problem structures in machine learning, and shows linear convergence for greedy (block) coordinate descent methods under a revived relaxation of strong convexity from 1963.

References

Showing 1–10 of 88 references
Linear Convergence of Proximal-Gradient Methods under the Polyak-Łojasiewicz Condition
In 1963, Polyak proposed a simple condition that is sufficient to show that gradient descent has a global linear convergence rate. This condition is a special case of the Łojasiewicz inequality…
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
This work shows that this much-older Polyak–Łojasiewicz (PL) inequality is actually weaker than the main conditions that have been explored to show linear convergence rates without strong convexity over the last 25 years, leading to simple proofs of linear convergence of these methods.
From error bounds to the complexity of first-order descent methods for convex functions
It is shown that error bounds can be used as effective tools for deriving complexity results for first-order descent methods in convex minimization, and that Kurdyka–Łojasiewicz (KŁ) inequalities can in turn be employed to compute new complexity bounds for a wealth of descent methods for convex problems.
On the Q-linear convergence of forward-backward splitting method and uniqueness of optimal solution to Lasso
In this paper, by using tools of second-order variational analysis, we study the popular forward-backward splitting method with Beck–Teboulle's line search for solving convex optimization problem…
Convergence of the Forward-Backward Algorithm: Beyond the Worst Case with the Help of Geometry
We provide a comprehensive study of the convergence of the forward-backward algorithm under suitable geometric conditions, such as conditioning or Łojasiewicz properties. These geometrical notions…
The restricted strong convexity revisited: analysis of equivalence to error bound and quadratic growth
  • Hui Zhang
  • Mathematics, Computer Science
  • Optim. Lett.
  • 2017
The restricted strong convexity is an effective tool for deriving globally linear convergence rates of descent methods in convex minimization; a group of modified and extended versions of the three notions in the title (restricted strong convexity, error bound, and quadratic growth) is proposed by using the gradient mapping and the proximal gradient mapping separately.
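As a rough guide to the three notions compared in this reference, one common way to write them for an \(L\)-smooth convex \(f\) with solution set \(\mathcal{X}^\star\), minimum value \(f^\star\), and \(\bar{x}=\operatorname{proj}_{\mathcal{X}^\star}(x)\) is the following; the formulations and the constant \(\mu>0\) are illustrative rather than the paper's exact ones:

\[
  \text{(QG)}\;\; f(x)-f^\star \;\ge\; \tfrac{\mu}{2}\operatorname{dist}\bigl(x,\mathcal{X}^\star\bigr)^2,
  \qquad
  \text{(EB)}\;\; \bigl\|\nabla f(x)\bigr\| \;\ge\; \mu\operatorname{dist}\bigl(x,\mathcal{X}^\star\bigr),
\]
\[
  \text{(RSC)}\;\; \bigl\langle \nabla f(x),\,x-\bar{x}\bigr\rangle \;\ge\; \mu\,\|x-\bar{x}\|^{2},
\]

and these conditions are known to be equivalent up to constant factors in this smooth convex setting.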
Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods
This work proves an abstract convergence result for descent methods that satisfy a sufficient-decrease assumption and allow a relative error tolerance; the result guarantees the convergence of bounded sequences under the assumption that the function f satisfies the Kurdyka–Łojasiewicz inequality.
A unified approach to error bounds for structured convex optimization problems
A new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function, is presented.
Exact Worst-Case Convergence Rates of the Proximal Gradient Method for Composite Convex Minimization
We study the worst-case convergence rates of the proximal gradient method for minimizing the sum of a smooth strongly convex function and a non-smooth convex function, whose proximal operator is…
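The method analyzed in this reference is the standard proximal gradient iteration for \(\min_x f(x)+g(x)\) with \(f\) smooth and \(g\) admitting a proximal operator; only the generic scheme is recalled here, not the exact worst-case constants of the cited paper:

\[
  x_{k+1} \;=\; \operatorname{prox}_{\gamma g}\bigl(x_k-\gamma\nabla f(x_k)\bigr),
\]

which reduces to gradient descent when \(g\equiv 0\) and to the proximal point method when \(f\equiv 0\).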
Calculus of the Exponent of Kurdyka–Łojasiewicz Inequality and Its Applications to Linear Convergence of First-Order Methods
The Kurdyka–Łojasiewicz (KŁ) exponent, an important quantity for analyzing the convergence rate of first-order methods, is studied, and various calculus rules are developed to deduce the KŁ exponent of new (possibly nonconvex and nonsmooth) functions formed from functions with known KŁ exponents.
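For context, the Kurdyka–Łojasiewicz (KŁ) property with exponent \(\theta\in[0,1)\) at a point \(\bar{x}\) is commonly written as follows; \(c,\varepsilon>0\) are generic constants and this is the standard definition rather than anything specific to the cited paper:

\[
  \operatorname{dist}\bigl(0,\partial f(x)\bigr) \;\ge\; c\,\bigl(f(x)-f(\bar{x})\bigr)^{\theta}
  \qquad\text{for all } x \text{ near } \bar{x} \text{ with } f(\bar{x})<f(x)<f(\bar{x})+\varepsilon,
\]

and the case \(\theta=\tfrac12\) is the one that typically yields linear convergence of first-order methods.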