Corpus ID: 14606229

On the Global Linear Convergence of Frank-Wolfe Optimization Variants

@inproceedings{LacosteJulien2015OnTG,
  title={On the Global Linear Convergence of Frank-Wolfe Optimization Variants},
  author={Simon Lacoste-Julien and Martin Jaggi},
  booktitle={NIPS},
  year={2015}
}
The Frank-Wolfe (FW) optimization algorithm has lately re-gained popularity thanks in particular to its ability to nicely handle the structured constraints appearing in machine learning applications. However, its convergence rate is known to be slow (sublinear) when the solution lies at the boundary. A simple less-known fix is to add the possibility to take 'away steps' during optimization, an operation that importantly does not require a feasibility oracle. In this paper, we highlight and… 
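
To make the away-step idea concrete, here is a minimal sketch of away-steps Frank-Wolfe over the probability simplex for a toy least-squares objective $f(x) = \tfrac{1}{2}\|Ax - b\|^2$. The objective, the exact line search for the quadratic, the tolerance, and all names are illustrative assumptions rather than the exact setup analyzed in the paper.

```python
# Hedged sketch: away-steps Frank-Wolfe on the probability simplex for a
# toy quadratic.  Objective, stopping rule, and names are illustrative.
import numpy as np

def away_steps_fw(A, b, n_iters=500, tol=1e-8):
    _, n = A.shape
    x = np.zeros(n)
    x[0] = 1.0                      # start at a vertex of the simplex
    alphas = {0: 1.0}               # active set: vertex index -> barycentric weight

    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)

        # Frank-Wolfe atom: the best simplex vertex (a coordinate vector)
        s = int(np.argmin(grad))
        d_fw = -x.copy()
        d_fw[s] += 1.0                             # e_s - x

        # Away atom: the worst vertex currently in the active set
        v = max(alphas, key=lambda i: grad[i])
        d_away = x.copy()
        d_away[v] -= 1.0                           # x - e_v

        gap_fw = -grad @ d_fw
        if gap_fw <= tol:                          # Frank-Wolfe gap ~ 0: done
            break

        if gap_fw >= -grad @ d_away:               # regular FW step
            d, gamma_max, away = d_fw, 1.0, False
        else:                                      # away step
            a_v = alphas[v]
            d, gamma_max, away = d_away, a_v / (1.0 - a_v), True

        # Exact line search for the quadratic, clipped to the feasible range
        Ad = A @ d
        gamma = gamma_max if Ad @ Ad == 0 else min(gamma_max, (-grad @ d) / (Ad @ Ad))

        x = x + gamma * d

        # Maintain the active-set weights
        if away:
            alphas = {i: (1 + gamma) * a for i, a in alphas.items()}
            alphas[v] -= gamma
        else:
            alphas = {i: (1 - gamma) * a for i, a in alphas.items()}
            alphas[s] = alphas.get(s, 0.0) + gamma
        alphas = {i: a for i, a in alphas.items() if a > 1e-12}

    return x
```

The design point worth noting is the cap $\gamma_{\max} = \alpha_v/(1-\alpha_v)$ on an away step: it is the largest step that keeps the iterate inside the simplex, so no feasibility oracle is needed, and being able to shrink or drop bad active atoms is the mechanism behind the linear rates discussed in the paper.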

Citations

Non-convex Optimization with Frank-Wolfe Algorithm and Its Variants
TLDR
A unified convergence analysis for the FW algorithm and its variants is presented under the setting of a nonconvex but smooth objective with a convex, compact constraint set, together with a novel observation on the so-called Frank-Wolfe gap.
Linearly Convergent Frank-Wolfe with Backtracking Line-Search.
TLDR
Variants of Away-steps and Pairwise FW that lift both restrictions simultaneously and inherit all the favorable convergence properties of the exact line-search version, including linear convergence for strongly convex functions over polytopes are proposed.
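
As a rough illustration of the kind of backtracking rule such adaptive variants build on, the sketch below adjusts a local smoothness estimate L until a quadratic upper bound certifies the chosen step; the shrink/grow factors, the interface, and the function name are assumptions, not the paper's exact procedure.

```python
# Hedged sketch of a backtracking step-size rule for a Frank-Wolfe direction d.
import numpy as np

def backtracking_fw_step(f, grad_x, x, d, lipschitz, gamma_max,
                         decrease=0.9, increase=2.0):
    """Return (gamma, L) with gamma in [0, gamma_max] such that
    f(x + gamma d) <= f(x) + gamma <grad_x, d> + 0.5 * L * gamma^2 * ||d||^2."""
    f_x = f(x)
    slope = np.dot(grad_x, d)           # assumed negative (d is a descent direction)
    d_norm2 = np.dot(d, d)
    L = decrease * lipschitz            # optimistically shrink the estimate first
    while True:
        gamma = min(gamma_max, -slope / (L * d_norm2))
        if f(x + gamma * d) <= f_x + gamma * slope + 0.5 * L * gamma ** 2 * d_norm2:
            return gamma, L             # quadratic upper bound holds: accept
        L = increase * L                # bound violated: grow L and retry

# Illustrative use inside a FW loop, with s the current FW vertex:
#   gamma, L = backtracking_fw_step(f, grad_f(x), x, s - x, L, gamma_max=1.0)
```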
Frank-Wolfe type methods for nonconvex inequality-constrained problems
TLDR
This paper proposes a new FW-type method for minimizing a smooth function over a compact set defined by a single nonconvex inequality constraint, based on new generalized linear-optimization oracles (LO).
Linearly Convergent Frank-Wolfe Made Practical
TLDR
This paper proposes simple modifications of AFW and PFW that lift both restrictions simultaneously and only require evaluation and gradient oracles on the objective, along with an approximate solution to the Frank-Wolfe linear subproblems.
Revisiting Frank-Wolfe for Polytopes: Strict Complementarity and Sparsity
  • D. Garber · NeurIPS 2020
TLDR
This paper revisits the addition of a strict complementarity assumption already considered in Wolfe's classical book, and proves that under this condition, the Frank-Wolfe method with away-steps and line-search converges linearly with a rate that depends explicitly only on the dimension of the optimal face.
Frank-Wolfe with Subsampling Oracle
TLDR
Two novel randomized variants of the Frank-Wolfe (FW) or conditional gradient algorithm are analyzed, one of which achieves a sublinear convergence rate as in the deterministic counterpart, while the other reaches a linear (i.e., exponential) convergence rate, making it the first provably convergent randomized variant of Away-step FW.
Boosting Frank-Wolfe by Chasing Gradients
TLDR
This paper proposes to speed up the Frank-Wolfe algorithm by better aligning the descent direction with the negative gradient via a subroutine, and derives convergence rates ranging from $\mathcal{O}(1/t)$ to $\mathcal{O}(e^{-\omega t})$ for the method.
One Sample Stochastic Frank-Wolfe
TLDR
This paper proposes the first one-sample stochastic Frank-Wolfe algorithm, called 1-SFW, that avoids the need to carefully tune the batch size, step size, learning rate, and other complicated hyperparameters, and achieves the optimal convergence rate of $\mathcal{O}(1/\epsilon^2)$.
Acceleration of Frank-Wolfe Algorithms with Open Loop Step-Sizes
TLDR
A general setting is characterized in which Frank-Wolfe algorithms with open-loop step-size rules converge non-asymptotically faster than with line search or the short-step rule, several accelerated convergence results for FW are derived, and potential gaps in the current understanding of the FW method are highlighted.
Frank-Wolfe Method is Automatically Adaptive to Error Bound Condition
TLDR
It is shown that the FW method (with a line search for the step size) for optimization over a strongly convex set is automatically adaptive to the error bound condition of the problem.

References

Showing 1–10 of 48 references
Faster Rates for the Frank-Wolfe Method over Strongly-Convex Sets
TLDR
This paper proves that the vanilla FW method converges at a rate of $1/t^2$, and shows that various balls induced by $\ell_p$ norms, Schatten norms and group norms are strongly convex on the one hand, while on the other hand linear optimization over these sets is straightforward and admits a closed-form solution.
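
For intuition on the closed-form linear optimization mentioned above, here is a minimal sketch of the linear minimization oracle over an $\ell_p$ ball (strongly convex for $1 < p \le 2$), using the standard Hölder-duality formula; the function and parameter names are illustrative, not code from the paper.

```python
# Hedged sketch: closed-form LMO over an l_p ball (requires p > 1).
import numpy as np

def lmo_lp_ball(g, radius=1.0, p=2.0):
    """Return argmin_{||s||_p <= radius} <g, s> in closed form."""
    q = p / (p - 1.0)                        # dual exponent, 1/p + 1/q = 1
    abs_g = np.abs(g)
    if not np.any(abs_g):                    # zero gradient: every point is optimal
        return np.zeros_like(g)
    s = -np.sign(g) * abs_g ** (q - 1.0)     # align with -g in the Hölder sense
    return radius * s / np.linalg.norm(s, ord=p)

# For p = 2 this reduces to -radius * g / ||g||_2, attaining <g, s> = -radius * ||g||_2.
```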
Linearly convergent away-step conditional gradient for non-strongly convex functions
TLDR
A variant of the algorithm and an analysis based on simple linear programming duality arguments, as well as corresponding error bounds, are provided, which enables the incorporation of the additional linear term and depends on a new constant that is explicitly expressed in terms of the problem's parameters and the geometry of the feasible set.
Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization
TLDR
A new general framework for convex optimization over matrix factorizations, where every Frank-Wolfe iteration will consist of a low-rank update, is presented, and the broad application areas of this approach are discussed.
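
As a quick illustration of why each iteration is a low-rank update in that setting, here is a minimal sketch of the linear minimization oracle over a nuclear-norm (trace-norm) ball, which returns a rank-1 atom built from the gradient's top singular pair; a full SVD is used only for brevity (a partial SVD suffices), and the names are assumptions.

```python
# Hedged sketch: LMO over a nuclear-norm ball returns a rank-1 atom.
import numpy as np

def lmo_nuclear_ball(grad, radius=1.0):
    """Return argmin_{||S||_* <= radius} <grad, S>: a rank-1 matrix."""
    U, _, Vt = np.linalg.svd(grad, full_matrices=False)
    u, v = U[:, 0], Vt[0, :]              # top singular pair of the gradient
    return -radius * np.outer(u, v)       # rank-1 vertex of the nuclear-norm ball

# The FW update X <- (1 - gamma) * X + gamma * lmo_nuclear_ball(grad) therefore
# changes the iterate by a rank-1 matrix at every step.
```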
Provable Submodular Minimization using Wolfe's Algorithm
TLDR
A maiden convergence analysis of Wolfe's algorithm is given, and a robust version of Fujishige's theorem is proved which shows that an $O(1/n^2)$-approximate solution to the min-norm point on the base polytope implies exact submodular minimization.
Barrier Frank-Wolfe for Marginal Inference
We introduce a globally-convergent algorithm for optimizing the tree-reweighted (TRW) variational objective over the marginal polytope. The algorithm is based on the conditional gradient method.
The Complexity of Large-scale Convex Programming under a Linear Optimization Oracle
TLDR
This paper formally establishes the theoretical optimality or near-optimality, in the large-scale case, of the CG method and its variants for solving different classes of CP problems, including smooth, nonsmooth and certain saddle-point problems.
On the von Neumann and Frank-Wolfe Algorithms with Away Steps
TLDR
It is shown that under the weaker condition that the origin is in the polytope, possibly on its boundary, a variant of the von Neumann algorithm that includes away steps generates a sequence of points in the polytope that converges linearly to zero.
An Affine Invariant Linear Convergence Analysis for Frank-Wolfe Algorithms
TLDR
This work shows the linear convergence of the standard Frank-Wolfe algorithm when the solution is in the interior of the domain, but with affine invariant constants, and of the away-steps variant of the Frank-Wolfe algorithm with constants which only depend on the geometry of the domain, and not the location of the optimal solution.
Introductory Lectures on Convex Optimization - A Basic Course
TLDR
It was in the middle of the 1980s, when the seminal paper by Karmarkar opened a new epoch in nonlinear optimization, and it became more and more common that the new methods were provided with a complexity analysis, which was considered a better justification of their efficiency than computational experiments.