Gradient methods for minimizing composite functions

@article{Nesterov2013GradientMF,
  title={Gradient methods for minimizing composite functions},
  author={Yurii Nesterov},
  journal={Mathematical Programming},
  year={2013},
  volume={140},
  pages={125-161}
}
  • Y. Nesterov
  • Published 2013
  • Mathematics, Computer Science
  • Mathematical Programming
In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two terms: one is smooth and given by a black-box oracle, and another is a simple general convex function with known structure. Despite the absence of good properties of the sum, such problems, both in convex and nonconvex cases, can be solved with efficiency typical for the first part of the objective. For convex problems of the above structure, we consider primal and… 
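As a concrete illustration of this composite model, here is a minimal proximal gradient sketch in which the smooth black-box term is taken to be a least-squares fit and the simple structured term an l1 penalty; both choices, and all names below, are illustrative assumptions rather than the paper's own setup.

import numpy as np

def prox_l1(x, t):
    # Proximal operator of t*||.||_1: componentwise soft-thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proximal_gradient(A, b, lam, L, iters=500):
    # Minimize f(x) + Psi(x) with f(x) = 0.5*||Ax - b||^2 (smooth term,
    # gradient A^T(Ax - b)) and Psi(x) = lam*||x||_1 (simple term with a
    # closed-form prox). L is a Lipschitz constant of grad f, e.g. the
    # largest eigenvalue of A^T A.
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        x = prox_l1(x - grad / L, lam / L)  # gradient step on f, prox step on Psi
    return x

# Example: recover a sparse vector from noiseless random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = rng.standard_normal(100) * (rng.random(100) < 0.1)
x_hat = proximal_gradient(A, b=A @ x_true, lam=0.1, L=np.linalg.norm(A, 2) ** 2)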

An adaptive accelerated proximal gradient method and its homotopy continuation for sparse optimization

TLDR
An accelerated proximal gradient method is presented for problems in which the smooth part of the objective function is also strongly convex; the method incorporates an efficient line-search procedure and achieves the optimal iteration complexity for such composite optimization problems.
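For reference, the standard accelerated proximal gradient (FISTA-type) update that such methods refine reads, in generic notation (the line-search and strong-convexity adaptations of the cited paper are not shown):

$$x_{k+1} = \operatorname{prox}_{\Psi/L}\!\big(y_k - \tfrac{1}{L}\nabla f(y_k)\big), \qquad t_{k+1} = \frac{1 + \sqrt{1 + 4t_k^2}}{2}, \qquad y_{k+1} = x_{k+1} + \frac{t_k - 1}{t_{k+1}}\,(x_{k+1} - x_k),$$

with $t_1 = 1$ and $y_1 = x_0$.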

Accelerated Regularized Newton Methods for Minimizing Composite Convex Functions

In this paper, we study accelerated Regularized Newton Methods for minimizing objectives formed as a sum of two functions: one is convex and twice differentiable with Hölder-continuous Hessian, and

A FISTA-type accelerated gradient algorithm for solving smooth nonconvex composite optimization problems

In this paper, we describe and establish iteration-complexity of two accelerated composite gradient (ACG) variants to solve a smooth nonconvex composite optimization problem whose objective function

Efficiency of minimizing compositions of convex functions and smooth maps

TLDR
It is shown that when the subproblems can only be solved by first-order methods, a simple combination of smoothing, the prox-linear method, and a fast-gradient scheme yields an algorithm with complexity, akin to gradient descent for smooth minimization.
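In generic notation (not quoted from the cited paper), the prox-linear step for a composite objective $h(c(x))$, with $h$ convex and $c$ a smooth map, linearizes only the inner map and adds a quadratic penalty:

$$x_{k+1} = \arg\min_{y}\; h\big(c(x_k) + \nabla c(x_k)(y - x_k)\big) + \tfrac{1}{2t}\,\|y - x_k\|^2,$$

and the cited analysis concerns the situation in which these convex subproblems can only be solved approximately by first-order methods.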

Accelerated inexact composite gradient methods for nonconvex spectral optimization problems

TLDR
Two inexact composite gradient methods are presented, one inner accelerated and another doubly accelerated, for solving a class of nonconvex spectral composite optimization problems; they take advantage of both the composite and the spectral structure underlying the objective function in order to efficiently generate their solutions.

Primal-dual fast gradient method with a model

TLDR
The main idea is the following: a dual solution to an approximation of the primal problem is found using the concept of a $(\delta, L)$-model, thereby realizing the principle of "divide and conquer".

Complexity bounds for primal-dual methods minimizing the model of objective function

  • Y. Nesterov
  • Mathematics, Computer Science
    Math. Program.
  • 2018
TLDR
This work equips the Frank–Wolfe method with a convergence analysis that makes it possible to approach a primal-dual solution of a convex optimization problem with a composite objective function, and justifies a new variant of the method that can be seen as a trust-region scheme applied to the linear model of the objective function.
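As background, here is a bare-bones sketch of the classical Frank–Wolfe iteration over the probability simplex (the primal-dual analysis and the trust-region variant of the cited work are not reproduced); the quadratic objective in the example is purely illustrative.

import numpy as np

def frank_wolfe_simplex(grad_f, x0, iters=200):
    # Classical Frank-Wolfe: at each step minimize the linear model
    # <grad_f(x), s> over the probability simplex (the minimizer is a
    # vertex, i.e. a coordinate basis vector), then move toward that
    # vertex with the standard step size 2/(k+2).
    x = x0.copy()
    for k in range(iters):
        g = grad_f(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0          # best vertex of the simplex
        gamma = 2.0 / (k + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x

# Example: minimize 0.5*x^T Q x over the simplex.
Q = np.diag(np.arange(1.0, 6.0))
x_opt = frank_wolfe_simplex(lambda x: Q @ x, np.full(5, 0.2))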

MOCCA: Mirrored Convex/Concave Optimization for Nonconvex Composite Functions

TLDR
The MOCCA (mirrored convex/concave) algorithm is proposed, a primal/dual optimization approach that takes a local convex approximation to each term at every iteration, and offers theoretical guarantees for convergence when the overall problem is approximately convex.

Augmented Lagrangian based first-order methods for convex and nonconvex programs: nonergodic convergence and iteration complexity

TLDR
A nonergodic convergence rate result is established for an augmented Lagrangian (AL) based first-order method (FOM) applied to convex problems with functional constraints, and a novel AL-based FOM is designed for problems with a non-convex objective and convex constraint functions.

On Convergence Rates of Linearized Proximal Algorithms for Convex Composite Optimization with Applications

TLDR
Under the assumptions of local weak sharp minima of order $p$ ($p \in [1,2]$) and a quasi-regularity condition, a local superlinear convergence rate is established for the linearized proximal algorithm (LPA).
...

References

SHOWING 1-10 OF 25 REFERENCES

Gradient methods for minimizing composite objective function

In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two convex terms: one is smooth and given by a black-box oracle, and

Accelerating the cubic regularization of Newton’s method on convex problems

TLDR
An accelerated version of the cubic regularization of Newton's method is presented that converges for the same problem class at an improved rate while keeping the complexity of each iteration unchanged; it is also argued that, for second-order schemes, the class of non-degenerate problems differs from the standard class.
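The cubic-regularized Newton step underlying this line of work minimizes the second-order model augmented with a cubic penalty, where $M$ bounds the Lipschitz constant of the Hessian; the cited paper accelerates this scheme, improving the global rate on convex problems from $O(1/k^2)$ to $O(1/k^3)$:

$$x_{k+1} = \arg\min_{y}\; \Big[ f(x_k) + \langle \nabla f(x_k),\, y - x_k\rangle + \tfrac{1}{2}\,\langle \nabla^2 f(x_k)(y - x_k),\, y - x_k\rangle + \tfrac{M}{6}\,\|y - x_k\|^3 \Big].$$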

Rounding of convex sets and efficient gradient methods for linear programming problems

  • Y. Nesterov
  • Mathematics, Computer Science
    Optim. Methods Softw.
  • 2008
TLDR
It is proved that the upper complexity bound for both schemes is O((√(n ln m)/δ)ln n) iterations of a gradient-type method, where n and m are the sizes of the corresponding linear programming problems.

A generalized proximal point algorithm for certain non-convex minimization problems

TLDR
This algorithm may be viewed as a generalization of the proximal point algorithm to cope with non-convexity of the objective function by linearizing the differentiable term at each iteration.

Introductory Lectures on Convex Optimization - A Basic Course

TLDR
It was in the middle of the 1980s that the seminal paper by Karmarkar opened a new epoch in nonlinear optimization; it became more and more common for new methods to be accompanied by a complexity analysis, which was considered a better justification of their efficiency than computational experiments.

Gradient Projection for Sparse Reconstruction: Application to Compressed Sensing and Other Inverse Problems

TLDR
This paper proposes gradient projection algorithms for the bound-constrained quadratic programming (BCQP) formulation of these problems and tests variants of this approach that select the line-search parameters in different ways, including techniques based on the Barzilai-Borwein method.
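A simplified sketch of gradient projection with a Barzilai-Borwein step for a generic bound-constrained QP follows (this is not the paper's specific GPSR formulation; the matrix B is assumed symmetric positive semidefinite):

import numpy as np

def projected_gradient_bb(B, c, z0, iters=300):
    # Projected gradient for   minimize 0.5*z^T B z + c^T z   s.t.  z >= 0,
    # using the BB1 step length and projection onto the nonnegative orthant.
    z = np.maximum(z0, 0.0)
    g = B @ z + c
    alpha = 1.0
    for _ in range(iters):
        z_new = np.maximum(z - alpha * g, 0.0)       # gradient step + projection
        g_new = B @ z_new + c
        s, y = z_new - z, g_new - g
        sy = s @ y
        alpha = (s @ s) / sy if sy > 1e-12 else 1.0  # Barzilai-Borwein step length
        z, g = z_new, g_new
    return z

# Example: a small two-dimensional problem.
B = np.array([[2.0, 0.5], [0.5, 1.0]])
z_opt = projected_gradient_bb(B, c=np.array([-1.0, -1.0]), z0=np.zeros(2))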

Smooth minimization of non-smooth functions

TLDR
A new approach for constructing efficient schemes for non-smooth convex optimization is proposed, based on a special smoothing technique, which can be applied to functions with explicit max-structure, and can be considered as an alternative to black-box minimization.
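In the standard presentation (notation is generic, not quoted from the paper), a function with explicit max-structure $f(x) = \max_{u \in Q}\{\langle Ax, u\rangle - \hat\phi(u)\}$ is replaced by the smooth uniform approximation

$$f_\mu(x) = \max_{u \in Q}\,\{\langle Ax, u\rangle - \hat\phi(u) - \mu\, d(u)\},$$

where $d$ is a prox-function that is strongly convex on $Q$ with parameter $\sigma$. Then $\nabla f_\mu$ is Lipschitz continuous with constant $\|A\|^2/(\mu\sigma)$ and $f_\mu(x) \le f(x) \le f_\mu(x) + \mu D$ with $D = \max_{u \in Q} d(u)$, so running a fast gradient method on $f_\mu$ with $\mu$ of order $\varepsilon$ yields $O(1/\varepsilon)$ complexity instead of the $O(1/\varepsilon^2)$ of black-box subgradient schemes.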

Just relax: convex programming methods for identifying sparse signals in noise

  • J. Tropp
  • Computer Science
    IEEE Transactions on Information Theory
  • 2006
TLDR
A method called convex relaxation is studied, which attempts to recover the ideal sparse signal by solving a convex program; this can be done in polynomial time with standard scientific software.

Iterative solution of nonlinear equations in several variables


Atomic Decomposition by Basis Pursuit

TLDR
Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.
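In standard notation, basis pursuit selects, among all decompositions $\Phi\alpha = s$ of the signal $s$ over the dictionary $\Phi$, the one with the smallest $\ell_1$ norm of coefficients:

$$\min_{\alpha}\ \|\alpha\|_1 \quad \text{subject to} \quad \Phi\alpha = s.$$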