• Corpus ID: 166228691

Revisiting Stochastic Extragradient

@inproceedings{Mishchenko2020RevisitingSE,
  title={Revisiting Stochastic Extragradient},
  author={Konstantin Mishchenko and D. Kovalev and Egor Shulgin and Peter Richt{\'a}rik and Yura Malitsky},
  booktitle={AISTATS},
  year={2020}
}
We fix a fundamental issue in the stochastic extragradient method by providing a new sampling strategy that is motivated by approximating implicit updates. Since the existing stochastic extragradient algorithm, Mirror-Prox of (Juditsky et al., 2011), diverges on a simple bilinear problem when the domain is not bounded, we prove guarantees for solving variational inequalities that go beyond existing settings. Furthermore, we illustrate numerically that the proposed variant converges faster…
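As a rough illustration of the sampling issue, here is a minimal Python sketch (not the authors' code; the toy bilinear problem, the matrices `A`, the step size `gamma`, and the iteration count are illustrative assumptions) contrasting an independent-sample extragradient step with a same-sample step that reuses the extrapolation sample in the update:

```python
# Minimal sketch of stochastic extragradient on a toy bilinear game
#   min_x max_y  x^T A_mean y,  with stochastic operator samples A_i.
# Illustrative only: problem, step size, and sampling are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, gamma, steps = 5, 0.1, 2000
A = [rng.standard_normal((n, n)) for _ in range(10)]


def operator(x, y, M):
    """Stochastic operator F(x, y) = (M y, -M^T x) for one sampled matrix M."""
    return M @ y, -M.T @ x


def seg(same_sample):
    x, y = np.ones(n), np.ones(n)
    for _ in range(steps):
        i = rng.integers(len(A))
        gx, gy = operator(x, y, A[i])
        x_half, y_half = x - gamma * gx, y - gamma * gy   # extrapolation step
        j = i if same_sample else rng.integers(len(A))    # reuse the sample or draw a new one
        gx, gy = operator(x_half, y_half, A[j])
        x, y = x - gamma * gx, y - gamma * gy             # update step
    return np.linalg.norm(x) + np.linalg.norm(y)          # distance to the solution (0, 0)


print("independent samples:", seg(same_sample=False))
print("same sample:        ", seg(same_sample=True))
```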

Citations

Training Generative Adversarial Networks via Stochastic Nash Games.
TLDR
A stochastic relaxed forward-backward algorithm for GANs is proposed, and convergence to an exact solution, or to a neighbourhood of it, is shown when the pseudogradient mapping of the game is monotone; the method is applied to the image generation problem, where it shows computational advantages over the extragradient scheme.
Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise
TLDR
This work proves the first high-probability complexity results with logarithmic dependence on the confidence level for stochastic methods for solving monotone and structured non-monotone VIPs with non-sub-Gaussian (heavy-tailed) noise and unbounded domains.
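In its standard form, the clipping referred to above rescales a stochastic gradient (or operator) estimate whenever its norm exceeds a threshold; the snippet below is a generic sketch of that operation (the threshold `lam` is an assumed parameter, not a value from the cited paper):

```python
import numpy as np

def clip(g, lam):
    """Rescale the stochastic estimate g so that its norm never exceeds lam."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else (lam / norm) * g
```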
Training Generative Adversarial Networks with Adaptive Composite Gradient
TLDR
The Adaptive Composite Gradient (ACG) method is proposed, which is linearly convergent in bilinear games under suitable settings; it is a novel semi-gradient-free algorithm, since it does not need to calculate the gradient at each step, reducing the computational cost of gradients and Hessians by utilizing predictive information from future iterations.
Training GANs with predictive projection centripetal acceleration
TLDR
This work proposes a novel predictive projection centripetal acceleration (PPCA) method to alleviate the cyclic behaviors of generative adversarial networks.
Generative Adversarial Networks as stochastic Nash games
TLDR
A stochastic relaxed forward-backward algorithm for GANs is proposed, and convergence to an exact solution, or to a neighbourhood of it, is shown when the pseudogradient mapping of the game is monotone; the method is applied to the image generation problem, where it shows computational advantages over the extragradient scheme.
Stochastic Extragradient: General Analysis and Improved Rates
TLDR
A novel theoretical framework is developed that allows several variants of SEG to be analyzed in a unified manner; the resulting guarantees outperform the current state-of-the-art convergence guarantees and rely on less restrictive assumptions.
Adversarial Estimation of Riesz Representers
TLDR
This work provides an adversarial approach to estimating Riesz representers of linear functionals within arbitrary function spaces, using a plethora of recently introduced machine learning techniques, and proves oracle inequalities based on the localized Rademacher complexity of the function space used to approximate the Riesz representer and on the approximation error.
On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging
TLDR
The stochastic bilinear minimax optimization problem is studied, an analysis of the same-sample Stochastic ExtraGradient method with constant step size is presented, and variations of the method that yield favorable convergence are proposed.
Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity
TLDR
The expected co-coercivity condition is introduced, its benefits are explained, and the first last-iterate convergence guarantees of SGDA and SCO under this condition are provided for solving a class of stochastic variational inequality problems that are potentially non-monotone.
Extrapolation for Large-batch Training in Deep Learning
TLDR
This work proposes to use computationally efficient extrapolation (extragradient) to stabilize the optimization trajectory while still benefiting from smoothing to avoid sharp minima, and proves the convergence of this novel scheme and rigorously evaluates its empirical performance on ResNet, LSTM, and Transformer.
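The extrapolation step mentioned in this entry is the extragradient-style look-ahead; a generic sketch (the names `grad_fn` and `gamma` are illustrative assumptions, not the cited paper's implementation) is:

```python
def extrapolation_step(w, grad_fn, gamma):
    """Generic extragradient-style update: compute the gradient at a look-ahead
    point and use it to update the original iterate."""
    w_half = w - gamma * grad_fn(w)      # look-ahead (extrapolation) point
    return w - gamma * grad_fn(w_half)   # update with the look-ahead gradient
```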
...
...

References

Showing 1-10 of 33 references
Reducing Noise in GAN Training with Variance Reduced Extragradient
TLDR
A novel stochastic variance-reduced extragradient optimization algorithm is proposed, which for a large class of games improves upon the convergence rates previously proposed in the literature.
Training GANs with Optimism
TLDR
This work addresses the issue of limit cycling behavior in training Generative Adversarial Networks, proposes the use of Optimistic Mirror Descent (OMD) for training Wasserstein GANs, and introduces a new algorithm, Optimistic Adam, which is an optimistic variant of Adam.
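Optimistic Mirror Descent, in its unconstrained gradient form, corrects the current gradient step with the previous gradient; a minimal sketch (the names and the step size `eta` are illustrative assumptions) is:

```python
def optimistic_step(w, g, g_prev, eta):
    """Optimistic gradient update: w_{t+1} = w_t - 2*eta*g_t + eta*g_{t-1}."""
    return w - 2.0 * eta * g + eta * g_prev
```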
Negative Momentum for Improved Game Dynamics
TLDR
It is proved that alternating updates are more stable than simultaneous updates, and it is shown both theoretically and empirically that alternating gradient updates with a negative momentum term achieve convergence not only in a difficult toy adversarial problem but also on notoriously difficult-to-train saturating GANs.
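The negative momentum idea amounts to a heavy-ball update whose momentum coefficient is negative; a minimal sketch (the coefficient value is an illustrative assumption) is:

```python
def negative_momentum_step(w, w_prev, g, eta, beta=-0.5):
    """Heavy-ball style update w_{t+1} = w_t - eta*g_t + beta*(w_t - w_prev),
    with a negative momentum coefficient beta."""
    return w - eta * g + beta * (w - w_prev)
```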
Wasserstein Generative Adversarial Networks
TLDR
This work introduces a new algorithm named WGAN, an alternative to traditional GAN training that can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches.
Solving variational inequalities with Stochastic Mirror-Prox algorithm
TLDR
A novel Stochastic Mirror-Prox algorithm is developed for solving stochastic variational inequalities with monotone operators, and it is shown that with a convenient stepsize strategy it attains the optimal rates of convergence with respect to the problem parameters.
Self-Attention Generative Adversarial Networks
TLDR
The proposed SAGAN achieves state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing the Fréchet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset.
Unrolled Generative Adversarial Networks
TLDR
This work introduces a method to stabilize Generative Adversarial Networks by defining the generator objective with respect to an unrolled optimization of the discriminator, and shows how this technique solves the common problem of mode collapse, stabilizes training of GANs with complex recurrent generators, and increases diversity and coverage of the data distribution by the generator.
Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization
TLDR
It is shown that OMWU monotonically decreases the Kullback-Leibler divergence of the current iterate to the (appropriately normalized) min-max solution until it enters a neighborhood of the solution, where it becomes a contracting map converging to the exact solution.
Improved Techniques for Training GANs
TLDR
This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic; it presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.
Dual extrapolation and its applications to solving variational inequalities and related problems
  • Y. Nesterov, Math. Program., 2007
TLDR
This paper shows that with an appropriate step-size strategy, the method is optimal both for Lipschitz continuous operators and for operators with bounded variation.
...
...