First-Order Methods for Nonconvex Quadratic Minimization

@article{Carmon2020FirstOrderMF,
  title={First-Order Methods for Nonconvex Quadratic Minimization},
  author={Yair Carmon and John C. Duchi},
  journal={SIAM Rev.},
  year={2020},
  volume={62},
  pages={395-436}
}
We consider minimization of indefinite quadratics with either trust-region (norm) constraints or cubic regularization. Despite the nonconvexity of these problems, we prove that, under mild assumptions, gradient descent converges to their global solutions, and give a non-asymptotic rate of convergence for the cubic variant. We also consider Krylov subspace solutions and establish sharp convergence guarantees to the solutions of both trust-region and cubic-regularized problems. Our rates mirror…
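
As a concrete illustration of the setting in the abstract, the two problems are the trust-region problem $\min_{\|x\| \le R} \tfrac{1}{2} x^\top A x + b^\top x$ and the cubic-regularized problem $\min_x \tfrac{1}{2} x^\top A x + b^\top x + \tfrac{\rho}{3}\|x\|^3$, with $A$ symmetric and possibly indefinite. The sketch below runs plain gradient descent on a small cubic-regularized instance and checks the standard global-optimality certificate $\nabla f(x^*) = 0$ and $\lambda_{\min}(A) + \rho\|x^*\| \ge 0$; the step size and initialization are illustrative choices, not the paper's precise prescription.

import numpy as np

# Sketch: gradient descent on a cubic-regularized indefinite quadratic
#   f(x) = 1/2 x^T A x + b^T x + (rho/3) * ||x||^3
rng = np.random.default_rng(0)
n, rho = 5, 1.0
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]   # random orthogonal basis
eigs = np.array([-1.0, -0.2, 0.5, 1.0, 2.0])       # indefinite spectrum
A = Q @ np.diag(eigs) @ Q.T
b = rng.standard_normal(n)

def grad(x):
    # grad f(x) = A x + b + rho * ||x|| * x
    return A @ x + b + rho * np.linalg.norm(x) * x

x = np.zeros(n)    # illustrative initialization (not the paper's prescription)
eta = 0.05         # illustrative fixed step size
for _ in range(5000):
    x = x - eta * grad(x)

# Global-optimality certificate for cubic regularization:
#   grad f(x) = 0  and  lambda_min(A) + rho * ||x|| >= 0
print("||grad f(x)|| =", np.linalg.norm(grad(x)))
print("lambda_min(A) + rho*||x|| =", eigs.min() + rho * np.linalg.norm(x))

With a random right-hand side $b$, the mild alignment assumption needed for convergence to the global solution holds almost surely, so the certificate above should be satisfied up to numerical tolerance.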
Adaptive exact penalty DC algorithms for nonsmooth DC optimization problems with equality and inequality constraints
We propose and study two DC (difference of convex functions) algorithms based on exact penalty functions for solving nonsmooth DC optimization problems with nonsmooth DC equality and inequality constraints…
Nonconvex-Nonconcave Min-Max Optimization with a Small Maximization Domain
A general approximation result leads to efficient algorithms for finding a near-stationary point in nonconvex-nonconcave min-max problems, for which the upper bounds are nearly optimal.
Stochastic Approximation for Online Tensorial Independent Component Analysis
This paper presents a convergence analysis for an online tensorial ICA algorithm by viewing the problem as a nonconvex stochastic approximation problem, and provides a dynamics-based analysis to prove that the algorithm, with a specific choice of stepsize, achieves a sharp finite-sample error bound.
Convex optimization based on global lower second-order models
This paper proves a global rate of convergence in functional residual for minimizing convex functions with Lipschitz continuous Hessian, significantly improving the previously known bound $\mathcal{O}(1/k)$ (with $k$ the iteration counter) for this type of algorithm.

References

Showing 1-10 of 61 references
Introductory Lectures on Convex Optimization - A Basic Course
It was in the middle of the 1980s that the seminal paper by Karmarkar opened a new epoch in nonlinear optimization; it became more and more common for new methods to be provided with a complexity analysis, which was considered a better justification of their efficiency than computational experiments.
The modification of Newton’s method for unconstrained optimization by bounding cubic terms
  • Technical report NA/12
  • 1981
Matrix computations
Gradient Descent Finds the Cubic-Regularized Nonconvex Newton Step
The minimization of a nonconvex quadratic form regularized by a cubic term is considered; although the problem may exhibit saddle points and a suboptimal local minimum, it is proved that, under mild assumptions, gradient descent converges to the global minimum.
Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems
Convergence rates for Krylov subspace solutions to the trust-region and cubic-regularized (nonconvex) quadratic problems are provided, along with lower bounds of the form $1/t^2$ and $e^{-4t/\sqrt{\kappa}}$.
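
To make the Krylov subspace approach concrete, the sketch below (an illustration under simplifying assumptions, not this paper's algorithm) builds an orthonormal basis of $\mathcal{K}_t(A, b) = \mathrm{span}\{b, Ab, \dots, A^{t-1}b\}$, projects the trust-region problem onto it, and solves the small projected problem by bisection on the Lagrange multiplier; the so-called hard case is ignored.

import numpy as np

# Sketch: Krylov subspace approximation to the trust-region problem
#   min_{||x|| <= R}  1/2 x^T A x + b^T x
# The multiplier search below assumes the "easy case" and ignores the hard case.
def krylov_trust_region(A, b, R, t):
    n = len(b)
    # Orthonormal basis of K_t(A, b); a Lanczos recurrence is the standard
    # choice, plain Gram-Schmidt keeps the sketch short.
    Q = np.zeros((n, t))
    q = b / np.linalg.norm(b)
    for j in range(t):
        Q[:, j] = q
        q = A @ Q[:, j]
        q -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ q)
        nrm = np.linalg.norm(q)
        if nrm < 1e-12:            # Krylov space became invariant; stop early
            Q = Q[:, :j + 1]
            break
        q = q / nrm
    T = Q.T @ A @ Q                # projected Hessian
    g = Q.T @ b                    # projected linear term
    evals, V = np.linalg.eigh(T)
    gtil = V.T @ g

    def ynorm(lam):                # ||y(lam)|| with y(lam) = -(T + lam I)^{-1} g
        return np.linalg.norm(gtil / (evals + lam))

    lo = max(0.0, -evals.min()) + 1e-12
    if ynorm(lo) <= R:             # solution already feasible at smallest multiplier
        lam = lo
    else:                          # bisection on the secular equation ||y(lam)|| = R
        hi = lo + 1.0
        while ynorm(hi) > R:
            hi *= 2.0
        for _ in range(100):
            lam = 0.5 * (lo + hi)
            lo, hi = (lam, hi) if ynorm(lam) > R else (lo, lam)
    y = V @ (-gtil / (evals + lam))
    return Q @ y                   # lift back to R^n

For a symmetric matrix A, a call such as krylov_trust_region(A, b, R=1.0, t=10) returns the subspace solution; as $t$ grows, rates of the kind quoted above describe how quickly it approaches the exact trust-region solution.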
A Newton-CG algorithm with complexity guarantees for smooth unconstrained optimization
We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton’s method and the linear conjugate gradient algorithm, with explicit detection and use of…
Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion, and Blind Deconvolution
By marrying statistical modeling with generic optimization theory, a general recipe for analyzing the trajectories of iterative algorithms via a leave-one-out perturbation argument is developed, establishing that gradient descent achieves near-optimal statistical and computational guarantees without explicit regularization.
Lower bounds for finding stationary points I
The lower bounds are sharp to within constants, and they show that gradient descent, cubic-regularized Newton’s method, and generalized $p$th-order regularization are worst-case optimal within their natural function classes.
Newton-type methods for non-convex optimization under inexact Hessian information
The canonical problem of finite-sum minimization is considered, and appropriate uniform and non-uniform sub-sampling strategies are provided to construct such Hessian approximations; optimal iteration complexity is obtained for the corresponding sub-sampled trust-region and adaptive cubic regularization methods.
Nonlinear Optimization Methods for Large-Scale Learning