• Corpus ID: 214611977

# Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling

@article{Hsieh2020ExploreAU,
title={Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling},
author={Yu-Guan Hsieh and Franck Iutzeler and J{\'e}r{\^o}me Malick and P. Mertikopoulos},
journal={ArXiv},
year={2020},
volume={abs/2003.10162}
}
• Published 23 March 2020
• Computer Science
• ArXiv
Owing to their stability and convergence speed, extragradient methods have become a staple for solving large-scale saddle-point problems in machine learning. The basic premise of these algorithms is the use of an extrapolation step before performing an update; thanks to this exploration step, extra-gradient methods overcome many of the non-convergence issues that plague gradient descent/ascent schemes. On the other hand, as we show in this paper, running vanilla extragradient with stochastic…
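The extrapolation-then-update idea in the abstract can be illustrated with a minimal deterministic sketch. It assumes the toy bilinear problem $\min_x \max_y xy$, which is not taken from the paper, and every function name is hypothetical. The separate `explore_step` and `update_step` parameters only echo the title's "explore aggressively, update conservatively"; the paper's actual stochastic stepsize schedule is not given in this excerpt and is not reproduced here.

```python
import math

def vector_field(x, y):
    # Toy objective f(x, y) = x * y (an assumption for illustration):
    # x minimizes, y maximizes, so the descent field is (df/dx, -df/dy).
    return y, -x

def gda_step(x, y, step):
    # Plain simultaneous gradient descent/ascent.
    gx, gy = vector_field(x, y)
    return x - step * gx, y - step * gy

def extragradient_step(x, y, explore_step, update_step):
    # 1) Extrapolation (exploration) step to a leading point.
    gx, gy = vector_field(x, y)
    x_lead, y_lead = x - explore_step * gx, y - explore_step * gy
    # 2) Update the ORIGINAL point using the field at the leading point.
    gx_lead, gy_lead = vector_field(x_lead, y_lead)
    return x - update_step * gx_lead, y - update_step * gy_lead

def run(step_fn, iters=200):
    # Iterate from (1, 1) and report the distance to the saddle point (0, 0).
    x, y = 1.0, 1.0
    for _ in range(iters):
        x, y = step_fn(x, y)
    return math.hypot(x, y)

start_dist = math.hypot(1.0, 1.0)
gda_dist = run(lambda x, y: gda_step(x, y, 0.1))
eg_dist = run(lambda x, y: extragradient_step(x, y, 0.1, 0.1))
print(f"start={start_dist:.3f}  GDA={gda_dist:.3f}  EG={eg_dist:.3f}")
```

On this toy problem, each descent/ascent step multiplies the squared distance to the saddle point by $1 + \eta^2$, so the iterates spiral outward. Each extragradient step multiplies it by $(1 - \eta\gamma)^2 + \eta^2$, where $\gamma$ is the exploration step and $\eta$ the update step, which is below 1 for the values used above.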

On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
• Computer Science
NeurIPS
• 2020
This paper analyzes the trajectories of stochastic gradient descent (SGD) to help understand the algorithm's convergence properties in non-convex problems. We first show that the sequence of iterates
• Computer Science
ICLR
• 2021
A new family of min-max optimization algorithms is presented that automatically exploits the geometry of the gradient data observed at earlier iterations to perform more informative extra-gradient steps in later ones, and achieves order-optimal convergence rates.
Tight last-iterate convergence rates for no-regret learning in multi-player games
• Computer Science
NeurIPS
• 2020
The optimistic gradient (OG) algorithm with a constant step-size, which is no-regret, achieves a last-iterate rate of $O(1/\sqrt{T})$ with respect to the gap function in smooth monotone games.
Optimality and Stability in Non-Convex Smooth Games
• Mathematics
J. Mach. Learn. Res.
• 2022
A unified approach to "local optimal" points in non-convex smooth games, which includes local Nash equilibria, local minimax points and local robust points for quadratic games, is provided, and their many special properties are demonstrated.
Stochastic Extragradient: General Analysis and Improved Rates
• Computer Science
AISTATS
• 2022
A novel theoretical framework is developed for analyzing several variants of SEG in a unified manner; the resulting convergence guarantees outperform the current state of the art and rely on less restrictive assumptions.
The Last-Iterate Convergence Rate of Optimistic Mirror Descent in Stochastic Variational Inequalities
• Computer Science
COLT
• 2021
This paper analyzes the local convergence rate of optimistic mirror descent methods in stochastic variational inequalities, a class of optimization problems with important applications to learning theory and machine learning, and quantifies this relation by means of the Legendre exponent.
On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging
• Computer Science
AISTATS
• 2022
The stochastic bilinear minimax optimization problem is studied, an analysis of the same-sample Stochastic ExtraGradient method with constant step size is presented, and variations of the method that yield favorable convergence are presented.
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
• Computer Science, Mathematics
ArXiv
• 2021
This paper revisits the classical entropy regularized policy gradient methods with the soft-max policy parametrization and proposes the first set of (nearly) unbiased stochastic policy gradient estimators with trajectory-level entropy regularization, and proves that although the estimators themselves are unbounded in general due to the additional logarithmic policy rewards, the variances are uniformly bounded.
Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity
• Computer Science, Mathematics
NeurIPS
• 2021
The expected co-coercivity condition is introduced, its benefits are explained, and the first last-iterate convergence guarantees of SGDA and SCO under this condition are provided for solving a class of stochastic variational inequality problems that are potentially non-monotone.
Stochastic Projective Splitting: Solving Saddle-Point Problems with Multiple Regularizers
• Computer Science
ArXiv
• 2021
This proposal is the first version of PS able to use stochastic (as opposed to deterministic) gradient oracles and is also the first stochastic method that can solve min-max games while easily handling multiple constraints and nonsmooth regularizers via projection and proximal operators.

## References

SHOWING 1-10 OF 48 REFERENCES
• Computer Science
AISTATS
• 2020
This work fixes a fundamental issue in the stochastic extragradient method by providing a new sampling strategy that is motivated by approximating implicit updates, and proves guarantees for solving variational inequality that go beyond existing settings.
ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems and GANs
• Computer Science
ArXiv
• 2019
This work analyzes last-iterate convergence of simultaneous gradient descent (simGD) and its variants under the assumption of convex-concavity, guided by a continuous-time analysis with differential equations.
Optimal stochastic extragradient schemes for pseudomonotone stochastic variational inequality problems and their variants
• Mathematics
Comput. Optim. Appl.
• 2019
This work presents an extragradient-based stochastic approximation scheme and proves that the iterates converge to a solution of the original problem under either pseudomonotonicity requirements or a suitably defined acute angle condition.
On the convergence of single-call stochastic extra-gradient methods
• Computer Science
NeurIPS
• 2019
A synthetic view of Extra-Gradient algorithms is developed, and it is shown that they retain an $\mathcal{O}(1/t)$ ergodic convergence rate in smooth, deterministic problems.
• Computer Science, Mathematics
AISTATS
• 2020
This paper shows that both EG and OGDA admit a unified analysis as approximations of the classical proximal point method for solving saddle point problems, and develops a new framework for analyzing EG and OGDA for bilinear and strongly convex-strongly concave settings.
• Computer Science
ICLR
• 2019
This work analyzes the behavior of mirror descent in a class of non-monotone problems whose solutions coincide with those of a naturally associated variational inequality, a property it calls coherence, and shows that optimistic mirror descent (OMD) converges in all coherent problems.
Convergence rate analysis of iterative algorithms for solving variational inequality problems
• M. Solodov
• Mathematics, Computer Science
Math. Program.
• 2003
A unified convergence rate analysis of iterative methods for solving the variational inequality problem is presented, based on certain error bounds; they subsume and extend the linear and sublinear rates of convergence established in several previous studies.
Solving variational inequalities with Stochastic Mirror-Prox algorithm
• Mathematics, Computer Science
• 2008
A novel Stochastic Mirror-Prox algorithm is developed for solving stochastic variational inequalities (s.v.i.) with monotone operators, and it is shown that with a suitable stepsize strategy it attains the optimal rates of convergence with respect to the problem parameters.
Convergence Behaviour of Some Gradient-Based Methods on Bilinear Zero-Sum Games
• Computer Science
ICLR 2020
• 2019
This work restricts itself to bilinear zero-sum games and gives a systematic analysis of popular gradient updates, for both simultaneous and alternating versions, and offers formal evidence that alternating updates converge "better" than simultaneous ones.
Regularized Iterative Stochastic Approximation Methods for Stochastic Variational Inequality Problems
• Mathematics
IEEE Transactions on Automatic Control
• 2013
This work introduces two classes of stochastic approximation methods, each of which requires exactly one projection step at every iteration, and provides convergence analysis for each of them.