Mirror descent in saddle-point problems: Going the extra (gradient) mile
- P. Mertikopoulos, Houssam Zenati, Bruno Lecouat, Chuan-Sheng Foo, V. Chandrasekhar, G. Piliouras
- Computer Science · International Conference on Learning…
- 7 July 2018
This work analyzes the behavior of mirror descent in a class of non-monotone problems whose solutions coincide with those of a naturally associated variational inequality, a property the authors call coherence, and shows that optimistic mirror descent (OMD) converges in all coherent problems.
The Unusual Effectiveness of Averaging in GAN Training
- Yasin Yazici, Chuan-Sheng Foo, Stefan Winkler, Kim-Hui Yap, G. Piliouras, V. Chandrasekhar
- Computer Science · International Conference on Learning…
- 12 June 2018
It is shown that EMA converges to limit cycles around the equilibrium with vanishing amplitude as the discount parameter approaches one for simple bilinear games and also enhances the stability of general GAN training.
Cycles in adversarial regularized learning
- P. Mertikopoulos, C. Papadimitriou, G. Piliouras
- Computer Science · ACM-SIAM Symposium on Discrete Algorithms
- 8 September 2017
It is shown that the system's behavior is Poincaré recurrent, implying that almost every trajectory revisits any (arbitrarily small) neighborhood of its starting point infinitely often.
First-order Methods Almost Always Avoid Saddle Points
- J. Lee, Ioannis Panageas, G. Piliouras, Max Simchowitz, Michael I. Jordan, B. Recht
- Computer Science, Mathematics · Neural Information Processing Systems
- 20 October 2017
It is established that first-order methods avoid saddle points for almost all initializations, and that neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid saddle points.
Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games
- Stefanos Leonardos, W. Overman, Ioannis Panageas, G. Piliouras
- Computer Science · International Conference on Learning…
- 3 June 2021
A novel definition of Markov Potential Games (MPG) is presented that generalizes prior attempts at capturing complex stateful multiagent coordination and proves (polynomially fast in the approximation error) convergence of independent policy gradient to Nash policies by adapting recent gradient dominance property arguments developed for single agent MDPs to multi-agent learning settings.
First-order methods almost always avoid strict saddle points
- J. Lee, Ioannis Panageas, G. Piliouras, Max Simchowitz, Michael I. Jordan, B. Recht
- Computer Science, Mathematics · Mathematical Programming
- 1 July 2019
It is established that first-order methods avoid strict saddle points for almost all initializations, and that neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid strict saddle points.
α-Rank: Multi-Agent Evaluation by Evolution
- Shayegan Omidshafiei, C. Papadimitriou, R. Munos
- Economics · Scientific Reports
- 9 July 2019
We introduce α-Rank, a principled evolutionary dynamics methodology, for the evaluation and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical game-theoretic…
Gradient Descent Only Converges to Minimizers: Non-Isolated Critical Points and Invariant Regions
- Ioannis Panageas, G. Piliouras
- Mathematics · Information Technology Convergence and Services
- 2 May 2016
It is proved that the set of initial conditions from which gradient descent converges to saddle points where the Hessian of f has at least one strictly negative eigenvalue has (Lebesgue) measure zero, even for cost functions f with non-isolated critical points, answering an open question in [12].
Multiplicative Weights Update in Zero-Sum Games
- James P. Bailey, G. Piliouras
- Economics · ACM Conference on Economics and Computation
- 11 June 2018
If equilibria are indeed predictive even for the benchmark class of zero-sum games, agents in practice must deviate robustly from the axiomatic perspective of optimization driven dynamics as captured by MWU and variants and apply carefully tailored equilibrium-seeking behavioral dynamics.
Multiplicative updates outperform generic no-regret learning in congestion games: extended abstract
- Robert D. Kleinberg, G. Piliouras, É. Tardos
- Economics · Symposium on the Theory of Computing
- 31 May 2009
The results show that natural learning behavior can avoid bad outcomes predicted by the price of anarchy in atomic congestion games such as the load-balancing game introduced by Koutsoupias and Papadimitriou, which has a super-constant price of anarchy and correlated equilibria that are exponentially worse than any mixed Nash equilibrium.
...