Corpus ID: 219792002

Competitive Mirror Descent

@article{Schafer2020CompetitiveMD,
  title={Competitive Mirror Descent},
  author={Florian Sch{\"a}fer and Anima Anandkumar and Houman Owhadi},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.10179}
}
Constrained competitive optimization involves multiple agents trying to minimize conflicting objectives, subject to constraints. This is a highly expressive modeling language that subsumes most of modern machine learning. In this work we propose competitive mirror descent (CMD): a general method for solving such problems based on first order information that can be obtained by automatic differentiation. First, by adding Lagrange multipliers, we obtain a simplified constraint set with an associated Bregman potential.
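
The abstract outlines a three-step recipe: add Lagrange multipliers, solve a regularized bilinear local game for a direction of movement, then follow that direction through the mirror map of the Bregman potential. The sketch below illustrates this recipe on a toy zero-sum game f(x, y) = x^T A y with both players on the positive orthant. It is a hedged illustration, not the paper's exact algorithm: the local game here is Euclidean-regularized (as in competitive gradient descent), whereas CMD regularizes with Bregman divergences; the entropic mirror step is chosen because it keeps iterates strictly positive without projections.

import numpy as np

def cmd_step_toy(x, y, A, eta=0.1):
    """One illustrative CMD-style step for the toy zero-sum game
    f(x, y) = x^T A y, both players on the positive orthant.
    Hedged sketch: direction from a Euclidean regularized bilinear
    local game, feasibility via an entropic mirror step."""
    gx = A @ y                        # grad_x f(x, y)
    gy = A.T @ x                      # grad_y f(x, y)
    Ix, Iy = np.eye(len(x)), np.eye(len(y))
    # Nash equilibrium of the regularized bilinear local game
    # (x minimizes, y maximizes), from the first-order conditions.
    dx = -eta * np.linalg.solve(Ix + eta**2 * A @ A.T, gx + eta * A @ gy)
    dy = eta * np.linalg.solve(Iy + eta**2 * A.T @ A, gy - eta * A.T @ gx)
    # Entropic mirror step: treating dx, dy as dual-space directions,
    # x <- x * exp(dx) stays strictly positive with no projection.
    return x * np.exp(dx), y * np.exp(dy)

# Usage: a few steps on a random 3x3 bilinear game.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
x, y = np.ones(3), np.ones(3)
for _ in range(100):
    x, y = cmd_step_toy(x, y, A)

The linear solve is the only per-iteration cost beyond gradients, which matches the abstract's claim that only a linear system is solved at each iteration.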

Citations

Polymatrix Competitive Gradient Descent

TLDR
Polymatrix competitive gradient descent (PCGD) is proposed as a method for solving general-sum competitive optimization involving arbitrary numbers of agents; local convergence of PCGD to stable fixed points is proved for n-player general-sum games, without requiring the step size to be adapted to the strength of the player interactions.
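
As a hedged sketch of the idea (my paraphrase, not quoted from the paper): in an n-player game where player i minimizes f_i, each PCGD update solves a regularized polymatrix (bilinear) local game in which every player best-responds to the simultaneous moves of all others,

    \delta x_i = \arg\min_{\delta x_i} \; \delta x_i^\top \nabla_{x_i} f_i
      + \sum_{j \neq i} \delta x_i^\top D^2_{x_i x_j} f_i \, \delta x_j
      + \frac{1}{2\eta} \lVert \delta x_i \rVert^2,

solved simultaneously for all i, which amounts to one linear system per iteration.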

Robust Reinforcement Learning: A Constrained Game-theoretic Approach

TLDR
This work proposes a game-theoretic framework for robust reinforcement learning that subsumes many previous works as special cases, formulating robust RL as a constrained minimax game between the RL agent and an environmental agent that represents uncertainties such as model parameter variations and adversarial disturbances.
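
In symbols, and purely as a generic rendering of the formulation described above (notation mine), this is a constrained two-player zero-sum game between the policy parameters \theta and an environment adversary \nu:

    \max_{\theta} \min_{\nu \in \mathcal{N}} \; J(\theta, \nu)
    \quad \text{subject to} \quad g_i(\theta, \nu) \le 0, \; i = 1, \dots, m,

where \mathcal{N} encodes the admissible model perturbations and disturbances.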

Lifted Primal-Dual Method for Bilinearly Coupled Smooth Minimax Optimization

TLDR
The Lifted Primal-Dual (LPD) method is devised, which lifts the objective into an extended form that allows both the smooth terms and the bilinear term to be handled optimally and seamlessly within the same primal-dual framework.
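
For context, the problem class in question is the bilinearly coupled smooth minimax problem in its standard form,

    \min_{x} \max_{y} \; f(x) + x^\top B y - g(y),

with f and g smooth and convex; the bilinear term x^\top B y is the only coupling between the two players.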

Lyapunov Exponents for Diversity in Differentiable Games

TLDR
Theoretical motivation for the method is given by leveraging machinery from the field of dynamical systems, and it is empirically evaluated by finding diverse solutions in the iterated prisoners’ dilemma and relevant machine learning problems including generative adversarial networks.

Complex Momentum for Optimization in Games

TLDR
It is empirically demonstrated that complex-valued momentum can improve convergence in realistic adversarial games, such as generative adversarial networks, by finding better solutions at an almost identical computational cost.

Complex Momentum for Learning in Games

TLDR
It is empirically demonstrated that complex-valued momentum can improve convergence in adversarial games, such as generative adversarial networks, by finding better solutions at an almost identical computational cost.
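
A minimal sketch of the idea behind these two entries as I read it (assuming the update "momentum buffer with a complex decay coefficient, parameters moved by the real part"; the papers' exact variants may differ):

import numpy as np

def complex_momentum_step(theta, mu, grad, alpha=0.1, beta=0.8 + 0.3j):
    """One gradient step with a complex momentum coefficient beta.
    Hedged sketch: the buffer mu is complex; only the real part of
    the buffer moves the (real) parameters theta."""
    mu = beta * mu - grad             # complex momentum accumulation
    theta = theta + alpha * np.real(mu)
    return theta, mu

# Usage on the toy bilinear min-max game f(x, y) = x * y, where
# simultaneous gradient play with real momentum tends to cycle.
x, y = 1.0, 1.0
mx, my = 0.0 + 0.0j, 0.0 + 0.0j
for _ in range(500):
    gx, gy = y, x                     # grad_x f, grad_y f
    x, mx = complex_momentum_step(x, mx, gx)       # x descends f
    y, my = complex_momentum_step(y, my, -gy)      # y ascends f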

COLA: Consistent Learning with Opponent-Learning Awareness

Learning in general-sum games is unstable and frequently leads to socially undesirable (Pareto-dominated) outcomes. To mitigate this, Learning with Opponent-Learning Awareness (LOLA) introduced opponent shaping, in which each agent accounts for its influence on the opponents' anticipated learning steps.
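
The core LOLA idea, recalled here for context with my own notation: instead of differentiating its value with the opponent held fixed, agent 1 differentiates through one anticipated learning step of the opponent,

    \nabla_{\theta_1} V_1\!\left(\theta_1, \; \theta_2 + \alpha \nabla_{\theta_2} V_2(\theta_1, \theta_2)\right),

so the update accounts for how \theta_1 shapes the opponent's next move; COLA revisits the consistency of this kind of lookahead update.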

References

SHOWING 1-10 OF 32 REFERENCES

Two-Player Games for Efficient Non-Convex Constrained Optimization

TLDR
It is proved that the equilibrium of this proxy-Lagrangian formulation, instead of being of unbounded size, can be taken to be a distribution over no more than m+1 models (where m is the number of constraints), a significant improvement in practical terms.
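
For context, the underlying two-player formulation is the standard Lagrangian game (the paper's proxy-Lagrangian refinement replaces the constraints seen by the \theta-player with differentiable proxies):

    \min_{\theta} \max_{\lambda \ge 0} \; f(\theta) + \sum_{i=1}^{m} \lambda_i \, g_i(\theta),

and the m+1 bound says that an (approximate) equilibrium strategy for the \theta-player can be supported on at most m+1 pure models.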

Reinforcement Learning with Convex Constraints

TLDR
This paper proposes an algorithmic scheme that can handle any constraint requiring the expected values of some vector of measurements to lie in a convex set. The scheme matches previous algorithms that enforce safety via constraints, but can also enforce new properties that those algorithms do not incorporate, such as diversity.
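
In symbols, the constraint class is (notation mine):

    \text{find } \pi \quad \text{such that} \quad \mathbb{E}_{\tau \sim \pi}\!\left[z(\tau)\right] \in \mathcal{C},

where z is a vector of measurements along trajectories \tau and \mathcal{C} is a convex set; safety and diversity constraints are both instances of this form.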

Global Convergence to the Equilibrium of GANs using Variational Inequalities

TLDR
This work uses the framework of Variational Inequalities to analyze popular training algorithms for a fundamental GAN variant, the Wasserstein Linear-Quadratic GAN, and shows that the steepest descent direction causes divergence from the equilibrium, while convergence to the equilibrium is achieved by following a particular orthogonal direction.
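
The variational-inequality viewpoint, stated generically (standard definition, not quoted from the paper): stack the players' gradients into an operator F(\omega) = (\nabla_\theta L(\theta, \phi), \, -\nabla_\phi L(\theta, \phi)) and seek

    \omega^* \in \Omega \quad \text{with} \quad \langle F(\omega^*), \, \omega - \omega^* \rangle \ge 0 \quad \forall \, \omega \in \Omega.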

Differentiable Game Mechanics

TLDR
New tools to understand and control the dynamics in n-player differentiable games are developed, and basic experiments show that SGA is competitive with recently proposed algorithms for finding stable fixed points in GANs, while at the same time being applicable to, and having guarantees in, much more general cases.

The Mechanics of n-Player Differentiable Games

TLDR
The key result decomposes the second-order dynamics into two components. The first relates to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law akin to conservation laws in classical mechanical systems.
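
Concretely, in a standard rendering of the decomposition (notation mine): let \xi(w) stack the players' gradients and J = \nabla \xi be the game Jacobian. Writing J = S + A with S = (J + J^\top)/2 and A = (J - J^\top)/2, the symmetric part corresponds to the potential component and the antisymmetric part to the Hamiltonian component; Symplectic Gradient Adjustment (SGA) then follows the adjusted direction

    \xi_{\mathrm{SGA}} = \xi + \lambda A^\top \xi,

where \lambda is a small adjustment coefficient.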

Competitive Gradient Descent

We introduce a new algorithm for the numerical computation of Nash equilibria of competitive two-player games. Our method is a natural generalization of gradient descent to the two-player setting, where the update is given by the Nash equilibrium of a regularized bilinear local approximation of the underlying game.
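
For the zero-sum case \min_x \max_y f(x, y), solving the first-order conditions of that regularized bilinear local game yields the update (this matches my reading of the paper's zero-sum form):

    x_{k+1} = x_k - \eta \left(\mathrm{Id} + \eta^2 D^2_{xy} f \, D^2_{yx} f\right)^{-1} \left(\nabla_x f + \eta \, D^2_{xy} f \, \nabla_y f\right),
    y_{k+1} = y_k + \eta \left(\mathrm{Id} + \eta^2 D^2_{yx} f \, D^2_{xy} f\right)^{-1} \left(\nabla_y f - \eta \, D^2_{yx} f \, \nabla_x f\right),

so each player anticipates the other's simultaneous move through the mixed second derivatives.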

On the convergence properties of the projected gradient method for convex optimization

When applied to an unconstrained minimization problem with a convex objective, the steepest descent method has stronger convergence properties than in the nonconvex case: the whole sequence converges.
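
The method in question is the classical projected gradient iteration

    x_{k+1} = \Pi_C\!\left(x_k - \alpha_k \nabla f(x_k)\right),

where \Pi_C is the Euclidean projection onto the convex feasible set C; CMD's mirror steps are designed precisely to keep iterates feasible without explicit projections of this kind.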

A Lagrangian Method for Inverse Problems in Reinforcement Learning

We cast inverse problems in reinforcement learning as nonlinear equality-constrained programs and propose a new game-theoretic solution method. Our approach is based on the saddle-point problem of the associated Lagrangian.
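
For an equality-constrained program \min_x f(x) subject to h(x) = 0, the associated saddle-point problem is, in standard form,

    \min_{x} \max_{\lambda} \; \mathcal{L}(x, \lambda) = f(x) + \lambda^\top h(x),

which is exactly the kind of unconstrained competitive problem that methods like CMD target after the multipliers are added.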

Constrained Policy Optimization

TLDR
Constrained Policy Optimization (CPO) is proposed: the first general-purpose policy search algorithm for constrained reinforcement learning with guarantees of near-constraint satisfaction at each iteration. CPO makes it possible to train neural network policies for high-dimensional control while providing guarantees about policy behavior throughout training.
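
CPO's per-iteration problem has the trust-region form (standard statement, simplified; surrogate notation mine):

    \pi_{k+1} = \arg\max_{\pi} \; \mathbb{E}\!\left[A^{\pi_k}(s, a)\right]
    \quad \text{s.t.} \quad J_{C_i}(\pi) \le d_i \;\; \forall i, \qquad \bar{D}_{\mathrm{KL}}(\pi \,\|\, \pi_k) \le \delta,

maximizing a surrogate advantage subject to bounds on each constraint cost and a KL trust region around the current policy.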

Optimizing Generalized Rate Metrics with Three Players

TLDR
This work extends previous two-player game approaches for constrained optimization to an approach with three players, in order to decouple the classifier rates from the non-linear objective, and seeks to find an equilibrium of the game.