Corpus ID: 201666696

OpenSpiel: A Framework for Reinforcement Learning in Games

@article{Lanctot2019OpenSpielAF,
  title={OpenSpiel: A Framework for Reinforcement Learning in Games},
  author={Marc Lanctot and Edward Lockhart and Jean-Baptiste Lespiau and Vin{\'i}cius Flores Zambaldi and Satyaki Upadhyay and Julien P{\'e}rolat and Sriram Srinivasan and Finbarr Timbers and Karl Tuyls and Shayegan Omidshafiei and Daniel Hennes and Dustin Morrill and Paul Muller and Timo Ewalds and Ryan Faulkner and J{\'a}nos Kram{\'a}r and Bart De Vylder and Brennan Saeta and James Bradbury and David Ding and Sebastian Borgeaud and Matthew Lai and Julian Schrittwieser and Thomas W. Anthony and Edward Hughes and Ivo Danihelka and Jonah Ryan-Davis},
  journal={ArXiv},
  year={2019},
  volume={abs/1908.09453}
}
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi-agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partially and fully observable) grid worlds and social dilemmas. OpenSpiel also includes tools to…
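The payoff categories named in the abstract (zero-sum, cooperative, general-sum) can be illustrated with a small standalone sketch for one-shot games. It does not use OpenSpiel's API: the function `classify_payoffs` and the three example payoff tables are hypothetical names introduced here for illustration only, and "cooperative" is assumed to mean that all players receive identical payoffs.

```python
def classify_payoffs(payoffs, tol=1e-9):
    """Classify a one-shot game given as {joint_action: (u_0, ..., u_{n-1})}.

    Hypothetical helper, not part of OpenSpiel.
    """
    profiles = list(payoffs.values())
    # Zero-sum: the players' payoffs sum to zero at every joint action.
    if all(abs(sum(u)) < tol for u in profiles):
        return "zero-sum"
    # Cooperative (assumed here to mean identical payoffs for all players).
    if all(all(abs(x - u[0]) < tol for x in u) for u in profiles):
        return "cooperative"
    # Anything else is general-sum.
    return "general-sum"


# Simultaneous-move, two-player examples (illustrative payoff values).
matching_pennies = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}
coordination = {
    ("A", "A"): (1, 1), ("A", "B"): (0, 0),
    ("B", "A"): (0, 0), ("B", "B"): (1, 1),
}
prisoners_dilemma = {
    ("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3), ("D", "D"): (-2, -2),
}

for name, game in [
    ("matching_pennies", matching_pennies),
    ("coordination", coordination),
    ("prisoners_dilemma", prisoners_dilemma),
]:
    print(name, "->", classify_payoffs(game))
```

The zero-sum check runs first, so a degenerate table with all-zero payoffs is reported as zero-sum rather than cooperative; that ordering is a choice of this sketch.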
RLCard: A Toolkit for Reinforcement Learning in Card Games
TLDR
An overview of the key components in RLCard, a discussion of the design principles, a brief introduction of the interfaces, and comprehensive evaluations of the environments are provided.
A Generalized Training Approach for Multiagent Learning
TLDR
This paper extends the theoretical underpinnings of PSRO by considering an alternative solution concept, $\alpha$-Rank, which is unique (thus faces no equilibrium selection issues, unlike Nash) and applies readily to general-sum, many-player settings, and establishes convergence guarantees in several game classes.
Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games
TLDR
This work introduces a framework, LMAC, based on meta-gradient descent that automates the discovery of the update rule without explicit human design and is able to generalise from small games to large games, for example training on Kuhn Poker and outperforming PSRO on Leduc Poker.
Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
TLDR
JPSRO is proposed, an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium, along with a novel solution concept, Maximum Gini Correlated Equilibrium (MGCE), a principled and computationally efficient family of solutions for solving the correlated equilibrium selection problem.
Solving Common-Payoff Games with Approximate Policy Iteration
TLDR
This work proposes CAPI, a novel algorithm which, like BAD, combines common knowledge with deep reinforcement learning; however, unlike BAD, CAPI prioritizes the propensity to discover optimal joint policies over scalability, which precludes CAPI from scaling to games as large as Hanabi.
DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning
  • D. Zha, Jingru Xie, +4 authors Ji Liu
  • Computer Science
  • ICML
  • 2021
TLDR
Through building DouZero, it is shown that classic Monte-Carlo methods can be made to deliver strong results in a hard domain with a complex action space, and the code and an online demo are released with the hope that this insight could motivate future work.
Fast computation of Nash Equilibria in Imperfect Information Games
TLDR
A class of algorithms for computing Nash equilibria in two-player zero-sum games, called Mirror Ascent against an Improved Opponent (MAIO), is introduced and it is shown that the speed of convergence depends on the amount of improvement offered by these improved policies.
Boltzmann Distributed Replicator Dynamics: Population Games in a Microgrid Context
TLDR
A distributed control method of learning that allows analyzing the effect of the exploration concept in MAS and shows that despite the lack of full information of the system, by controlling some parameters of the method, it has similar behavior to the traditional centralized approaches.
Human-Agent Cooperation in Bridge Bidding
We introduce a human-compatible reinforcement-learning approach to a cooperative game, making use of a third-party hand-coded human-compatible bot to generate initial training data and to perform…
Mava: a research framework for distributed multi-agent reinforcement learning
TLDR
Mava is presented: a research framework specifically designed for building scalable MARL systems. It provides useful components, abstractions, utilities and tools for MARL and allows for simple scaling for multi-process system training and execution, while providing a high level of flexibility and composability.

References

SHOWING 1-10 OF 101 REFERENCES
Markov Games as a Framework for Multi-Agent Reinforcement Learning
TLDR
A Q-learning-like algorithm for finding optimal policies and its application to a simple two-player game in which the optimal policy is probabilistic is demonstrated.
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
TLDR
An algorithm is described, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection, which generalizes previous ones such as InRL.
Classes of Multiagent Q-learning Dynamics with epsilon-greedy Exploration
TLDR
This work derives and studies an idealization of Q-learning in 2-player 2-action repeated general-sum games, and addresses the discontinuous case of epsilon-greedy exploration and uses it as a proxy for value-based algorithms to highlight a contrast with existing results in policy search.
The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract)
TLDR
The promise of ALE is illustrated by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning, and an evaluation methodology made possible by ALE is proposed.
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
TLDR
This paper introduces the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge, and combines fictitious self-play with deep reinforcement learning.
Open-ended Learning in Symmetric Zero-sum Games
TLDR
A geometric framework for formulating agent objectives in zero-sum games is introduced, and a new algorithm (rectified Nash response, PSRO_rN) is developed that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms.
Rethinking Formal Models of Partially Observable Multiagent Decision Making
TLDR
This paper proves that any timeable perfect-recall EFG can be efficiently modeled as a FOG as well as relating FOGs to other existing formalisms, and presents the two building blocks of these breakthroughs, counterfactual regret minimization and public state decomposition, in the new formalism.
A Multiagent Reinforcement Learning Algorithm with Non-linear Dynamics
TLDR
A new MARL algorithm called the Weighted Policy Learner (WPL), which allows agents to reach a Nash Equilibrium (NE) in benchmark 2-player-2-action games with minimum knowledge and outperforms the state-of-the-art algorithms in a more realistic setting of 100 agents interacting and learning concurrently.
Multi-agent Reinforcement Learning in Sequential Social Dilemmas
TLDR
This work analyzes the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games and characterizes how learned behavior in each domain changes as a function of environmental factors including resource abundance.
Evolutionary Dynamics of Multi-Agent Learning: A Survey
TLDR
This article surveys the dynamical models that have been derived for various multi-agent reinforcement learning algorithms, making it possible to study and compare them qualitatively, and provides a roadmap on the progress that has been achieved in analysing the evolutionary dynamics of multi-agent learning.