Corpus ID: 204009004

Nonconvex stochastic optimization on manifolds via Riemannian Frank-Wolfe methods

  title={Nonconvex stochastic optimization on manifolds via Riemannian Frank-Wolfe methods},
  author={Melanie Weber and Suvrit Sra},
We study stochastic projection-free methods for constrained optimization of smooth functions on Riemannian manifolds (where there are additional constraints beyond the parameter domain being a manifold). Specifically, we introduce stochastic Riemannian Frank-Wolfe methods for both nonconvex and geodesically convex problems. We present algorithms for both stochastic optimization and finite-sum problems. For the latter, we develop variance reduced methods, including a Riemannian adaptation of the… Expand
Zeroth-order Optimization on Riemannian Manifolds
This paper considers stochastic zeroth-order optimization over Riemannian submanifolds embedded in an Euclidean space, an important but less studied area, and proposes four algorithms for solving this class of problems under different settings. Expand
Stochastic Zeroth-order Riemannian Derivative Estimation and Optimization
We consider stochastic zeroth-order optimization over Riemannian submanifolds embedded in Euclidean space, where the task is to solve Riemannian optimization problem with only noisy objectiveExpand
Acceleration in Hyperbolic and Spherical Spaces
We further research on the acceleration phenomenon on Riemannian manifolds by introducing the first global first-order method that achieves the same rates as accelerated gradient descent in theExpand
No-go Theorem for Acceleration in the Hyperbolic Plane
It is proved that in a noisy setting, there is no analogue of accelerated gradient descent for geodesically convex functions on the hyperbolic plane. Expand
From Nesterov's Estimate Sequence to Riemannian Acceleration
This work revisits Nesterov's estimate sequence technique and develops an alternative analysis for it, and extends this analysis to the Riemannian setting, localizing the key difficulty due to non-Euclidean structure into a certain ``metric distortion. Expand
Majorization-Minimization on the Stiefel Manifold With Application to Robust Sparse PCA
A framework for optimizing cost functions of orthonormal basis learning problems, such as principal component analysis (PCA), subspace recovery, orthogonal dictionary learning, etc, is proposed and a new set of algorithms for sparse PCA driven by this methodology is proposed. Expand
Gradient descent algorithms for Bures-Wasserstein barycenters
A framework to derive global rates of convergence for both gradient descent and stochastic gradient descent despite the fact that the barycenter functional is not geodesically convex is developed by employing a Polyak-Lojasiewicz (PL) inequality. Expand
Riemannian Optimization via Frank-Wolfe Methods.
Non-asymptotic convergence rates of RFW to an optimum for (geodesically) convex problems, and to a critical point for nonconvex objectives, are analyzed and a practical setting under which RFW can attain a linear convergence rate is presented. Expand


Frank-Wolfe methods for geodesically convex optimization with application to the matrix geometric mean
Under an additional hypothesis, it is proved how Riemannian FW can even attain a linear rate of convergence, and these are the first results on Riemanian FW and its convergence. Expand
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds
This paper introduces Riemannian SVRG (RSVRG), a new variance reduced RiemANNian optimization method, and presents the first non-asymptotic complexity analysis (novel even for the batch setting) for nonconvex Riem Mannian optimization. Expand
Averaging Stochastic Gradient Descent on Riemannian Manifolds
A geometric framework is developed to transform a sequence of slowly converging iterates generated from stochastic gradient descent on a Riemannian manifold to an averaged iterate sequence with a robust and fast $O(1/n)$ convergence rate. Expand
A Riemannian BFGS Method Without Differentiated Retraction for Nonconvex Optimization Problems
It is proven that the Riemannian BFGS method converges globally to stationary points without assuming the objective function to be convex and superlinearly to a nondegenerate minimizer. Expand
Riemannian Stochastic Recursive Gradient Algorithm
Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite number of loss functions on a Riemannian manifold. The present paper proposes aExpand
First-order Methods for Geodesically Convex Optimization
This work is the first to provide global complexity analysis for first-order algorithms for general g-convex optimization, and proves upper bounds for the global complexity of deterministic and stochastic (sub)gradient methods for optimizing smooth and nonsmooth g- Convex functions, both with and without strong g-Convexity. Expand
Riemannian stochastic variance reduced gradient
A novel Riemannian extension of the Euclidean stochastic variance reduced gradient (R-SVRG) algorithm to a manifold search space for minimizing the average of a large but finite number of loss functions. Expand
Stochastic Frank-Wolfe methods for nonconvex optimization
For objective functions that decompose into a finite-sum, ideas from variance reduction for convex optimization are leveraged to obtain new variance reduced nonconvex Frank- Wolfe methods that have provably faster convergence than the classical Frank-Wolfe method. Expand
R-SPIDER: A Fast Riemannian Stochastic Optimization Algorithm with Curvature Independent Rate
This work adapts the recently proposed SPIDER algorithm to Riemannian manifold, and can achieve faster rate than known algorithms in both the finite sum and stochastic settings. Expand
Simple Algorithms for Optimization on Riemannian Manifolds with Constraints
This work extends an augmented Lagrangian method and smoothed versions of an exact penalty method to the Riemannian case, together with some fundamental convergence results that indicate some gains in computational efficiency and accuracy in some regimes for minimum balanced cut, non-negative PCA and k-means, especially in high dimensions. Expand