Provable Self-Play Algorithms for Competitive Reinforcement Learning
@inproceedings{Bai2020ProvableSA, title={Provable Self-Play Algorithms for Competitive Reinforcement Learning}, author={Yu Bai and C. Jin}, booktitle={ICML}, year={2020} }
Self-play, where the algorithm learns by playing against itself without requiring any direct supervision, has become the new weapon in modern Reinforcement Learning (RL) for achieving superhuman performance in practice. However, the majority of existing theory in reinforcement learning only applies to the setting where the agent plays against a fixed environment; it remains largely open whether self-play algorithms can be provably effective, especially when it is necessary to manage the…
12 Citations
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
- Computer Science, Mathematics
- ArXiv
- 2020
- 1
Provably Efficient Online Agnostic Learning in Markov Games
- Computer Science, Mathematics
- ArXiv
- 2020
- Highly Influenced
Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
- Computer Science, Mathematics
- NeurIPS
- 2020
- 6
Independent Policy Gradient Methods for Competitive Reinforcement Learning
- Computer Science
- NeurIPS
- 2020
- 1
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
- Computer Science, Mathematics
- ArXiv
- 2020
- 4
Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory
- Computer Science, Mathematics
- ArXiv
- 2020
Dynamic Regret of Policy Optimization in Non-stationary Environments
- Computer Science, Mathematics
- NeurIPS
- 2020
- 4