• Corpus ID: 88522834

# On Noisy Negative Curvature Descent: Competing with Gradient Descent for Faster Non-convex Optimization

@article{Liu2017OnNN,
  title={On Noisy Negative Curvature Descent: Competing with Gradient Descent for Faster Non-convex Optimization},
  author={Mingrui Liu and Tianbao Yang},
  journal={arXiv: Optimization and Control},
  year={2017}
}
• Published 25 September 2017
• Computer Science
• arXiv: Optimization and Control
The Hessian-vector product has been utilized to find a second-order stationary solution with strong complexity guarantees (e.g., almost linear time complexity in the problem's dimensionality). In this paper, we propose to further reduce the number of Hessian-vector products for faster non-convex optimization. Previous algorithms need to approximate the smallest eigenvalue with a sufficient precision (e.g., $\epsilon_2\ll 1$) in order to achieve a sufficiently accurate second-order stationary…
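For intuition, the Hessian-vector products these methods rely on can be formed without ever materializing the Hessian, e.g. by finite differences of gradients, and a shifted power iteration then approximates the smallest eigenvalue. The following is a minimal numpy sketch of that generic idea, not the paper's algorithm; the test function, shift, and iteration counts are illustrative assumptions:

```python
import numpy as np

def hvp(grad, x, v, eps=1e-5):
    """Hessian-vector product via a central finite difference of gradients."""
    return (grad(x + eps * v) - grad(x - eps * v)) / (2 * eps)

def min_eig(grad, x, dim, shift=10.0, iters=500, seed=0):
    """Approximate the smallest Hessian eigenvalue and its eigenvector by
    power iteration on the shifted operator shift*I - H."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = shift * v - hvp(grad, x, v)
        v = w / np.linalg.norm(w)
    lam = v @ hvp(grad, x, v)  # Rayleigh quotient = eigenvalue estimate
    return lam, v

# Illustrative non-convex quadratic with Hessian diag(1, -2).
grad = lambda x: np.array([x[0], -2.0 * x[1]])
lam, v = min_eig(grad, np.zeros(2), dim=2)
```

Here the most negative curvature is -2 along the second coordinate, which the shifted power iteration recovers without forming the 2x2 Hessian explicitly.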

## Citations

NEON+: Accelerated Gradient Methods for Extracting Negative Curvature for Non-Convex Optimization
• Computer Science
• 2017
By leveraging the proposed AG methods for extracting negative curvature, this work presents a new AG algorithm with double loops for non-convex optimization, improving the iteration complexity of gradient descent by a factor of $\epsilon^{-0.25}$ and matching the best iteration complexity of second-order Hessian-free methods for non-convex optimization.
Sample Complexity of Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization
• Computer Science, Mathematics
AISTATS
• 2019
A stochastic variance-reduced cubic-regularized (SVRC) Newton's method is proposed under both sampling with and without replacement schemes, which improves the sample complexity of CR as well as other sub-sampled variants via the variance reduction scheme.
Convergence of Cubic Regularization for Nonconvex Optimization under KL Property
• Computer Science, Mathematics
NeurIPS
• 2018
The asymptotic convergence rates of CR are explored by exploiting the ubiquitous Kurdyka-Lojasiewicz (KL) property of nonconvex objective functions, in terms of the function value gap, variable distance gap, gradient norm, and least eigenvalue of the Hessian matrix.
Cubic Regularization with Momentum for Nonconvex Optimization
• Computer Science
UAI
• 2019
Theoretically, it is proved that CR under momentum achieves the best possible convergence rate to a second-order stationary point for nonconvex optimization, and the proposed algorithm can allow computational inexactness that reduces the overall sample complexity without degrading the convergence rate.
• Computer Science
ArXiv
• 2021
Cubic-GDA is developed, the first GDA-type algorithm for escaping strict saddle points in nonconvex-strongly-concave minimax optimization; it achieves an order-wise faster convergence rate than the standard GDA for a wide spectrum of gradient-dominant geometries.
On the Second-order Convergence Properties of Random Search Methods
• Computer Science, Mathematics
NeurIPS
• 2021
A novel variant of random search that exploits negative curvature by only relying on function evaluations is proposed, and it is proved that this approach converges to a second-order stationary point at a much faster rate than vanilla methods.
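As a toy illustration of curvature-exploiting random search (a generic sketch, not the variant analyzed in this paper), a comparison step that probes the objective at two symmetric points can drift off a strict saddle using function evaluations only; the test function and sampling radius are assumptions:

```python
import numpy as np

def random_search_step(f, x, sigma, rng):
    """Probe f at two symmetric points x +/- sigma*u along a random unit
    direction and move to the better probe if it improves on f(x).
    At a strict saddle, both probes along a negative-curvature direction
    beat f(x), so the step escapes using only function values."""
    u = rng.standard_normal(x.size)
    u /= np.linalg.norm(u)
    best = min((x + sigma * u, x - sigma * u), key=f)
    return best if f(best) < f(x) else x

# Strict saddle of f at the origin: curvature +2 along x0, -2 along x1.
f = lambda x: x[0] ** 2 - x[1] ** 2
rng = np.random.default_rng(0)
x = np.zeros(2)
for _ in range(500):
    x = random_search_step(f, x, sigma=0.05, rng=rng)
```

Every accepted move strictly decreases f, so after a few hundred steps the iterate has left the saddle value f(0) = 0.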
A Subsampling Line-Search Method with Second-Order Results
• Computer Science
• 2018
A stochastic algorithm based on negative curvature and Newton-type directions that are computed for a subsampling model of the objective is described, which encompasses the deterministic regime, and allows us to identify sampling requirements for second-order line-search paradigms.
First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time
• Computer Science
NeurIPS
• 2018
A novel perspective on the noise-adding technique is presented, i.e., adding noise to the first-order information can help extract negative curvature from the Hessian matrix, and a formal reasoning of this perspective is provided by analyzing a simple first-order procedure.
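The idea described here is easy to illustrate: starting from a small random perturbation u, iterating on the gradient difference g(x+u) - g(x) behaves like power iteration with I - eta*H and so amplifies the most negative curvature direction. A hedged numpy sketch of that generic mechanism (the saddle function, step size, and rescaling radius are illustrative, not from the paper):

```python
import numpy as np

def neon_direction(grad, x, eta=0.1, iters=300, r=1e-4, seed=1):
    """First-order negative-curvature extraction: the update
    u <- u - eta*(g(x+u) - g(x)) approximates (I - eta*H) u, so repeated
    application amplifies the most negative curvature direction of H."""
    rng = np.random.default_rng(seed)
    u = r * rng.standard_normal(x.size)
    g0 = grad(x)
    for _ in range(iters):
        u = u - eta * (grad(x + u) - g0)
        n = np.linalg.norm(u)
        if n > 1.0:            # rescale to keep the local approximation valid
            u = u / n * r
    return u / np.linalg.norm(u)

# Saddle at the origin with Hessian diag(1, -2): the negative curvature
# direction is the second coordinate axis.
grad = lambda x: np.array([x[0], -2.0 * x[1]])
v = neon_direction(grad, np.zeros(2))
```

The returned unit vector aligns with the eigenvector of the most negative eigenvalue, obtained from gradient evaluations alone.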
Exploiting negative curvature in deterministic and stochastic optimization
• Computer Science
Math. Program.
• 2019
New frameworks for combining descent and negative curvature directions are presented: alternating two-step approaches and dynamic step approaches that make algorithmic decisions based on (estimated) upper-bounding models of the objective function.
Regional complexity analysis of algorithms for nonconvex smooth optimization
• Computer Science
Math. Program.
• 2021
A strategy is proposed for characterizing the worst-case performance of algorithms for solving nonconvex smooth optimization problems over regions defined by first- and second-order derivatives and for analyzing the behavior of higher-order algorithms.

## References

SHOWING 1-10 OF 29 REFERENCES
Gradient Descent Efficiently Finds the Cubic-Regularized Non-Convex Newton Step
• Mathematics, Computer Science
ArXiv
• 2016
We consider the minimization of non-convex quadratic forms regularized by a cubic term, which exhibit multiple saddle points and poor local minima. Nonetheless, we prove that, under mild assumptions,…
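A tiny illustration of this setting (a generic sketch under assumed values, not the paper's analysis): plain gradient descent on a cubic-regularized indefinite quadratic reaches a stationary point of the regularized model even though the quadratic alone is unbounded below.

```python
import numpy as np

# Cubic-regularized model: g(v) = b@v + 0.5 * v@A@v + (rho/3) * ||v||^3
A = np.diag([1.0, -1.0])      # indefinite quadratic term
b = np.array([1.0, 1.0])
rho = 1.0

def cubic_grad(v):
    """Gradient of the cubic-regularized quadratic model."""
    return b + A @ v + rho * np.linalg.norm(v) * v

v = np.zeros(2)
eta = 0.05                     # small fixed step size
for _ in range(5000):
    v = v - eta * cubic_grad(v)

g = b @ v + 0.5 * v @ A @ v + (rho / 3) * np.linalg.norm(v) ** 3
```

The cubic term dominates for large ||v||, so the iterates stay bounded and settle at a stationary point with model value below g(0) = 0.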
Accelerated Methods for Non-Convex Optimization
• Computer Science
ArXiv
• 2016
The method improves upon the complexity of gradient descent and provides the additional second-order guarantee that $\nabla^2 f(x) \succeq -O(\epsilon^{1/2})I$ for the computed $x$.
How to Escape Saddle Points Efficiently
• Computer Science, Mathematics
ICML
• 2017
This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations which depends only poly-logarithmically on dimension, showing that perturbed gradient descent can escape saddle points almost for free.
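The perturbation idea can be sketched in a few lines: run gradient descent, and whenever the gradient is small, add a small random kick so the iterate can fall off a strict saddle. A simplified numpy illustration (the actual algorithm perturbs at most once per escape episode; the test function, radii, and thresholds here are assumptions):

```python
import numpy as np

def perturbed_gd(grad, x, eta=0.01, g_thresh=1e-3, r=1e-2,
                 steps=20000, seed=0):
    """Gradient descent that injects a small random perturbation whenever
    the gradient is small, so iterates can escape strict saddle points."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        g = grad(x)
        if np.linalg.norm(g) < g_thresh:
            x = x + r * rng.standard_normal(x.size)  # kick off the saddle
        else:
            x = x - eta * g
    return x

# f(x) = 0.25*x0^4 - 0.5*x0^2 + 0.5*x1^2 has a strict saddle at the
# origin and minima at x0 = +/-1, x1 = 0.
grad = lambda x: np.array([x[0] ** 3 - x[0], x[1]])
x = perturbed_gd(grad, np.zeros(2))
```

Started exactly at the saddle, plain gradient descent would never move; the first kick breaks the symmetry and the iterate then slides into one of the two minima.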
Sub-sampled Cubic Regularization for Non-convex Optimization
• Computer Science, Mathematics
ICML
• 2017
This work provides a sampling scheme that gives sufficiently accurate gradient and Hessian approximations to retain the strong global and local convergence guarantees of cubically regularized methods, and is the first work that gives global convergence guarantees for a sub-sampled variant of cubic regularization on non-convex functions.
Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition
• Computer Science, Mathematics
COLT
• 2015
This paper identifies the strict saddle property for non-convex problems that allows for efficient optimization of orthogonal tensor decomposition, and shows that stochastic gradient descent converges to a local minimum in a polynomial number of iterations.
Finding approximate local minima faster than gradient descent
• Computer Science
STOC
• 2017
We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of…
Gradient methods for minimizing composite objective function
In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two convex terms: one is smooth and given by a black-box oracle, and…
Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization
• Computer Science
Math. Program.
• 2016
A randomized stochastic projected gradient (RSPG) algorithm, in which a proper mini-batch of samples is taken at each iteration depending on the total budget of stochastic samples allowed, is proposed, which shows nearly optimal complexity of the algorithm for convex stochastic programming.
Complexity Analysis of Second-Order Line-Search Algorithms for Smooth Nonconvex Optimization
• Computer Science
SIAM J. Optim.
• 2018
This paper presents an algorithm with favorable complexity properties that differs in two significant ways from other recently proposed methods by being based on line searches only: each step involves computation of a search direction, followed by a backtracking line search along that direction.
Even Faster SVD Decomposition Yet Without Agonizing Pain
• Computer Science
NIPS
• 2016
A new framework for SVD is put forward, yielding the first accelerated and stochastic method that outperforms [2] in the running-time regime and in certain parameter regimes without even using alternating minimization.