• Corpus ID: 238354203

Stochastic Multiplicative Weights Updates in Zero-Sum Games

James P. Bailey, Sai Ganesh Nagarajan, Georgios Piliouras
We study agents competing against each other in a repeated network zero-sum game while applying the multiplicative weights update (MWU) algorithm with fixed learning rates. In our implementation, agents select their strategies probabilistically in each iteration and update their weights/strategies using the realized payoff vector of all strategies, i.e., stochastic MWU with full information. We show that the system results in an irreducible Markov chain where agent strategies diverge from the… 
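
The update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the two-player special case of a network zero-sum game, the exponential form of MWU (weights multiplied by `exp(eps * payoff)`), and a hypothetical function name `stochastic_mwu` with illustrative parameters `eps`, `steps`, and `seed`. Each agent samples a pure strategy from its current mixed strategy, then updates all of its weights using the payoff each strategy would have earned against the opponent's realized action (full information).

```python
import numpy as np


def softmax(z):
    """Turn log-weights into a probability vector (shifted for numerical stability)."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def stochastic_mwu(A, eps=0.1, steps=1000, seed=0):
    """Sketch of stochastic MWU with full information (assumed exponential form).

    A[i, j] is the row player's payoff; the column player receives -A[i, j].
    Returns the mixed strategies used at each step as two arrays.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    log_w_row = np.zeros(n)  # log-weights; all zeros means uniform play
    log_w_col = np.zeros(m)
    xs, ys = [], []
    for _ in range(steps):
        x = softmax(log_w_row)
        y = softmax(log_w_col)
        xs.append(x)
        ys.append(y)
        # Each agent samples a pure strategy from its current mixed strategy.
        i = rng.choice(n, p=x)
        j = rng.choice(m, p=y)
        # Full information: every strategy is scored against the opponent's
        # realized action. Adding eps * payoff to a log-weight is the same as
        # multiplying the weight by exp(eps * payoff), with a fixed rate eps.
        log_w_row += eps * A[:, j]
        log_w_col += eps * (-A[i, :])
    return np.array(xs), np.array(ys)


# Usage: Matching Pennies, whose unique Nash equilibrium is fully mixed.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
xs, ys = stochastic_mwu(A, eps=0.1, steps=5000, seed=0)
print("final row strategy:", xs[-1])
print("final column strategy:", ys[-1])
```

Storing log-weights rather than raw weights is a design choice of this sketch to avoid overflow over long runs; the normalized strategies are the same as those obtained from the raw multiplicative update.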


How and Why to Manipulate Your Own Agent
This paper views the outcomes of the agents’ dynamics as inducing a “meta-game” between the users, proposes a general framework to model and analyze these strategic interactions between users of learning agents for general games, and analyzes the equilibria induced on the users in three classes of games.
Auctions between Regret-Minimizing Agents
This work analyzes a scenario in which software agents implemented as regret-minimizing algorithms engage in a repeated auction on behalf of their users and shows that, surprisingly, in second-price auctions the players have incentives to misreport their true valuations to their own learning agents.


Multiplicative Weights Update in Zero-Sum Games
If equilibria are indeed predictive even for the benchmark class of zero-sum games, agents in practice must deviate robustly from the axiomatic perspective of optimization driven dynamics as captured by MWU and variants and apply carefully tailored equilibrium-seeking behavioral dynamics.
Chaos, Extremism and Optimism: Volume Analysis of Learning in Games
Two novel, rather negative properties of MWU in zero-sum games are proved. Extremism: even in games with a unique fully mixed Nash equilibrium, the system recurrently gets stuck near pure-strategy profiles, despite their being clearly unstable from a game-theoretic perspective. Unavoidability: the system cannot avoid bad points indefinitely.
Vortices Instead of Equilibria in MinMax Optimization: Chaos and Butterfly Effects of Online Learning in Zero-Sum Games
It is proved that no meaningful prediction can be made about the day-to-day behavior of online learning dynamics in zero-sum games, and that chaos is robust to all affine variants of zero-sum games, to network variants with an arbitrarily large number of agents, and even to competitive settings beyond these.
Fast and Furious Learning in Zero-Sum Games: Vanishing Regret with Non-Vanishing Step Sizes
We show for the first time, to our knowledge, that it is possible to reconcile in online learning in zero-sum games two seemingly contradictory objectives: vanishing time-average regret and
Convergence of probability measures
The author's preface gives an outline: "This book is about weak-convergence methods in metric spaces, with applications sufficient to show their power and utility. The Introduction motivates the
The Multiplicative Weights Update Method: a Meta-Algorithm and Applications
A simple meta-algorithm is presented that unifies many of these disparate algorithms and derives them as simple instantiations of the meta-algorithm.