Newton acceleration on manifolds identified by proximal gradient methods

@article{bareilles_newton,
  title={Newton acceleration on manifolds identified by proximal gradient methods},
  author={Gilles Bareilles and Franck Iutzeler and J{\'e}r{\^o}me Malick},
  journal={Mathematical Programming},
}
Proximal methods are known to identify the underlying substructure of nonsmooth optimization problems. Even more, in many interesting situations, the output of a proximity operator comes with its structure at no additional cost, and convergence is improved once it matches the structure of a minimizer. However, it is impossible in general to know whether the current structure is final or not; such highly valuable information has to be exploited adaptively. To do so, we place ourselves in the case… 
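As a rough illustration of the abstract's point (not the paper's algorithm), the proximity operator of the ℓ1 norm is soft-thresholding, and its output carries its sparsity pattern — the "structure" — at no additional cost. A minimal proximal-gradient sketch on an assumed least-squares problem:

```python
import numpy as np

def soft_threshold(x, lam):
    # Proximity operator of lam * ||.||_1: shrinks entries toward zero,
    # setting small ones exactly to zero.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# Proximal gradient on f(x) = 0.5*||Ax - b||^2 + lam*||x||_1
# (illustrative data; not the paper's setting).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = A @ np.array([1.0, 0.0, 0.0, -2.0, 0.0]) + 0.01 * rng.standard_normal(20)
x = np.zeros(5)
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
for _ in range(200):
    x = soft_threshold(x - step * A.T @ (A @ x - b), step * 0.5)

support = np.flatnonzero(x)  # the identified substructure, read off for free
print(support)
```

Once the support stabilizes across iterations, a method can (adaptively, as the paper argues) switch to faster steps restricted to that manifold.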


Escaping Spurious Local Minima of Low-Rank Matrix Factorization Through Convex Lifting

Experiments on real-world large-scale recommendation system problems show that MF-Global can effectively escape spurious local solutions at which existing MF approaches get stuck, and is orders of magnitude faster than state-of-the-art algorithms for the lifted convex form.

Training Structured Neural Networks Through Manifold Identification and Variance Reduction

It is proved that after a finite number of iterations, all iterates of RMDA possess a desired structure identical to that induced by the regularizer at the stationary point of asymptotic convergence, even in the presence of engineering tricks like data augmentation that complicate the training process.

Accelerated projected gradient algorithms for sparsity constrained optimization problems

Two acceleration schemes with global convergence guarantees are presented, one by same-space extrapolation and the other by subspace identification, for the nonconvex best subset selection problem that minimizes a given empirical loss function under an ℓ0-norm constraint.
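For context, the projection underlying such sparsity-constrained methods is simple hard thresholding. A minimal projected-gradient (iterative hard thresholding) sketch on assumed data, without the paper's extrapolation or subspace-identification accelerations:

```python
import numpy as np

def project_l0(x, k):
    # Euclidean projection onto {x : ||x||_0 <= k}: keep the k largest
    # entries in magnitude, zero out the rest.
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

# Projected gradient on f(x) = 0.5*||Ax - b||^2 with ||x||_0 <= 2
# (illustrative noiseless data).
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 8))
x_true = np.zeros(8)
x_true[[1, 4]] = [2.0, -1.5]
b = A @ x_true
x = np.zeros(8)
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(300):
    x = project_l0(x - step * A.T @ (A @ x - b), k=2)
```

Every iterate is feasible by construction; the support selected by the projection is the "subspace" that identification-based acceleration schemes exploit.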

Coordinate Descent for SLOPE

This work proposes a new fast algorithm to solve the SLOPE optimization problem, which combines proximal gradient descent and proximal coordinate descent steps, and provides new results on the directional derivative of the SLOPE penalty and its related SLOPE thresholding operator.

Harnessing structure in composite nonsmooth minimization

This work shows that their method identifies the optimal smooth substructure and converges locally quadratically on two problems: the minimization of a maximum of quadratic functions and the minimization of the maximal eigenvalue of a parametrized matrix.

Accelerating Inexact Successive Quadratic Approximation for Regularized Optimization Through Manifold Identification

This work shows that for partly smooth regularizers, although general inexact solutions cannot identify the active manifold that makes the objective function smooth, approximate solutions generated by commonly-used subproblem solvers will identify this manifold, even with arbitrarily low solution precision.

Geometrical interpretation of the predictor-corrector type algorithms in structured optimization problems

This article develops sufficient conditions for quadratic convergence of predictor-corrector methods using a proximal point correction step, and argues that returning in this manner is preferable to returning via the projection mapping.

Adaptive regularization with cubics on manifolds

A generalization of ARC to optimization on Riemannian manifolds is studied, which generalizes the iteration complexity results to this richer framework and identifies appropriate manifold-specific assumptions that allow it to secure complexity guarantees both when using the exponential map and when using a general retraction.

A simple Newton method for local nonsmooth optimization

A new bundle Newton method is described that incorporates second-order objective information alongside the usual linear approximation oracle, and local quadratic convergence is proved.

A Riemannian BFGS Method Without Differentiated Retraction for Nonconvex Optimization Problems

It is proven that the Riemannian BFGS method converges globally to stationary points without assuming the objective function to be convex and superlinearly to a nondegenerate minimizer.

A proximal method for composite minimization

An algorithmic framework is described, based on a subproblem constructed from a linearized approximation to the objective plus a regularization term; this construction underlies both a global convergence result and a property of identifying the active manifold containing the solution of the original problem.

Newton methods for nonsmooth convex minimization: connections among U-Lagrangian, Riemannian Newton and SQP methods

Newton-type methods for minimizing partly smooth convex functions are studied, using local parameterizations obtained from U-Lagrangian theory and from Riemannian geometry.

Identifying active constraints via partial smoothness and prox-regularity

This work extends work of Burke, Moré and Wright on identifiable surfaces from the convex case to a general nonsmooth setting, and shows how this setting can be used in the study of sufficient conditions for local minimizers.

A quasi-Newton proximal splitting method

Efficient implementations of the proximity calculation for a useful class of functions are described, and an elegant quasi-Newton method is applied to accelerate convex minimization problems; it compares favorably against state-of-the-art alternatives.

Active Sets, Nonsmoothness, and Sensitivity

It is shown under a natural regularity condition that critical points of partly smooth functions are stable: small perturbations to the function cause small movements of the critical point on the active manifold.