# Optimality and Stability in Non-Convex-Non-Concave Min-Max Optimization

```bibtex
@article{Zhang2020OptimalityAS,
  title   = {Optimality and Stability in Non-Convex-Non-Concave Min-Max Optimization},
  author  = {Guojun Zhang and Pascal Poupart and Yaoliang Yu},
  journal = {ArXiv},
  year    = {2020},
  volume  = {abs/2002.11875}
}
```

Convergence to a saddle point for convex-concave functions has been studied for decades, while the last few years have seen a surge of interest in non-convex-non-concave min-max optimization due to the rise of deep learning. However, it remains an open research challenge how locally optimal points should be defined and which algorithms can converge to such points. We study definitions of “local min-max (max-min)” points and provide an elegant unification, with the corresponding first- and second…

#### 7 Citations

Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games

- 2021

We study gradient descent-ascent learning dynamics with timescale separation in unconstrained continuous action zero-sum games where the minimizing player faces a nonconvex optimization problem and…

The Landscape of the Proximal Point Method for Nonconvex-Nonconcave Minimax Optimization

- 2020

Minimax optimization has become a central tool for modern machine learning with applications in generative adversarial networks, robust optimization, reinforcement learning, etc. These applications…

Minimax Optimization with Smooth Algorithmic Adversaries

- Computer Science
- ArXiv
- 2021

A new algorithm is proposed for the min-player to play against smooth algorithms deployed by the adversary instead of against full maximization; it is guaranteed to make monotonic progress and to find an appropriate “stationary point” in a polynomial number of iterations.

A Minimax Theorem for Nonconcave-Nonconvex Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets

- Computer Science
- 2020

The main contribution is to provide an approximate minimax theorem for a large class of games where the players pick neural networks, including WGAN, StarCraft II, and the Blotto Game, which relies on the fact that despite being nonconcave-nonconvex with respect to the neural networks' parameters, these games are concave-convex with respect to the actual models represented by these neural networks.

Newton-type Methods for Minimax Optimization

- Computer Science, Mathematics
- ArXiv
- 2020

This work provides a detailed analysis of existing algorithms and relates them to two novel Newton-type algorithms that converge faster to (strict) local minimax points and are much more effective when the problem is ill-conditioned.

A Limited-Capacity Minimax Theorem for Non-Convex Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets

- Computer Science
- AISTATS
- 2021

The main contribution is to provide an approximate minimax theorem for a large class of games where the players pick neural networks, including WGAN, StarCraft II, and the Blotto Game, which relies on the fact that despite being nonconcave-nonconvex with respect to the neural networks' parameters, these games are concave-convex with respect to the actual models represented by these neural networks.

On the Suboptimality of Negative Momentum for Minimax Optimization

- Mathematics, Computer Science
- AISTATS
- 2021

It is shown that negative momentum accelerates local convergence of game dynamics, though at a suboptimal rate; this is the first work to provide an explicit convergence rate for negative momentum in this setting.

#### References

Showing 1-10 of 60 references

The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization

- Mathematics, Computer Science
- NeurIPS
- 2018

This work characterizes the limit points of two basic first-order methods, namely Gradient Descent/Ascent (GDA) and Optimistic Gradient Descent Ascent (OGDA), and shows that both dynamics avoid unstable critical points for almost all initializations.
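The two dynamics compared in this work can be sketched in a few lines. On the bilinear toy game f(x, y) = x·y (a standard illustration chosen here, not an example taken from the paper), plain GDA spirals outward from the saddle point while OGDA, which extrapolates using the previous gradient, converges to (0, 0):

```python
# GDA vs. OGDA on the bilinear game min_x max_y f(x, y) = x * y.
# GDA:  x_{t+1} = x_t - eta * y_t,               y_{t+1} = y_t + eta * x_t
# OGDA: x_{t+1} = x_t - eta * (2*y_t - y_{t-1}), y_{t+1} = y_t + eta * (2*x_t - x_{t-1})

def gda(x, y, eta=0.1, steps=2000):
    for _ in range(steps):
        x, y = x - eta * y, y + eta * x
    return x, y

def ogda(x, y, eta=0.1, steps=2000):
    x_prev, y_prev = x, y  # start the extrapolation from the initial point
    for _ in range(steps):
        x_new = x - eta * (2 * y - y_prev)
        y_new = y + eta * (2 * x - x_prev)
        x_prev, y_prev, x, y = x, y, x_new, y_new
    return x, y

x_g, y_g = gda(1.0, 1.0)   # diverges: each GDA step scales the norm by sqrt(1 + eta^2)
x_o, y_o = ogda(1.0, 1.0)  # converges to the saddle point (0, 0)
print(x_g**2 + y_g**2, x_o**2 + y_o**2)
```

The extra `- eta * (y_t - y_{t-1})` correction is what damps the rotational component of the vector field; removing it recovers plain GDA.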

Minmax Optimization: Stable Limit Points of Gradient Descent Ascent are Locally Optimal

- Computer Science, Mathematics
- ArXiv
- 2019

It is shown that as the ratio of the ascent step size to the descent step size goes to infinity, stable limit points of GDA are exactly local minmax points (up to degenerate points), demonstrating that all stable limit points of GDA have a game-theoretic meaning for minmax problems.
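The timescale-separated GDA studied here can be sketched on a simple convex-concave quadratic f(x, y) = x² + 2xy - y² (a toy choice for illustration only, not the paper's nonconvex setting): the ascent step is the descent step scaled by a ratio tau, so with a large tau the max player approximately tracks its best response y*(x) = x between min-player updates, and the iterates contract to the unique min-max point (0, 0):

```python
# Two-timescale GDA on f(x, y) = x**2 + 2*x*y - y**2:
# the max player moves with step size tau * eta, the min player with eta.

def two_timescale_gda(x, y, eta=0.01, tau=10.0, steps=2000):
    for _ in range(steps):
        gx = 2 * x + 2 * y          # df/dx
        gy = 2 * x - 2 * y          # df/dy
        x, y = x - eta * gx, y + tau * eta * gy
    return x, y

x_t, y_t = two_timescale_gda(1.0, 1.0)
print(abs(x_t) + abs(y_t))  # contracts toward the min-max point (0, 0)
```

For these step sizes the linearized update matrix has spectral radius below one, which is why the sketch converges; the paper's result concerns the limit of this step-size ratio in the nonconvex case.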

On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach

- Computer Science, Mathematics
- ICLR
- 2020

The proposed Follow-the-Ridge (FR) algorithm theoretically addresses the notorious rotational behaviour of gradient dynamics, is compatible with preconditioning and positive momentum, and improves the convergence of GAN training compared to recent minimax optimization algorithms.

Poincaré Recurrence, Cycles and Spurious Equilibria in Gradient-Descent-Ascent for Non-Convex Non-Concave Zero-Sum Games

- Computer Science, Mathematics
- NeurIPS
- 2019

It is established theoretically that, depending on the specific problem instance, gradient-descent-ascent dynamics can exhibit a variety of behaviors antithetical to convergence to the game-theoretically meaningful min-max solution.

Mirror descent in saddle-point problems: Going the extra (gradient) mile

- Computer Science, Mathematics
- ICLR
- 2019

This work analyzes the behavior of mirror descent in a class of non-monotone problems whose solutions coincide with those of a naturally associated variational inequality, a property the authors call coherence, and shows that optimistic mirror descent (OMD) converges in all coherent problems.

SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation

- Computer Science
- ICML
- 2018

This paper revisits the Bellman equation and reformulates it into a novel primal-dual optimization problem using Nesterov's smoothing technique and the Legendre-Fenchel transformation, then develops a new algorithm, called Smoothed Bellman Error Embedding, to solve this optimization problem, where any differentiable function class may be used.

Gradient descent GAN optimization is locally stable

- Computer Science, Mathematics
- NIPS
- 2017

This paper analyzes the "gradient descent" form of GAN optimization, i.e., the natural setting where one simultaneously takes small gradient steps in both generator and discriminator parameters, and proposes an additional regularization term for gradient descent GAN updates that is able to guarantee local stability for both the WGAN and the traditional GAN.

Interaction Matters: A Note on Non-asymptotic Local Convergence of Generative Adversarial Networks

- Computer Science, Mathematics
- AISTATS
- 2019

A simple yet unified non-asymptotic local convergence theory for smooth two-player games, which subsumes several discrete-time gradient-based saddle point dynamics and reveals the surprising nature of the off-diagonal interaction term.

Stochastic Variance Reduction Methods for Policy Evaluation

- Computer Science, Mathematics
- ICML
- 2017

This paper first transforms the empirical policy evaluation problem into a (quadratic) convex-concave saddle point problem, and then presents a primal-dual batch gradient method, as well as two stochastic variance reduction methods for solving the problem.

Semi-Infinite Programming: Theory, Methods, and Applications

- Computer Science, Mathematics
- SIAM Rev.
- 1993

This paper treats numerical methods based on either discretization or local reduction, with emphasis on the design of superlinearly convergent (SQP-type) methods.