DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning
@inproceedings{Zha2021DouZeroMD,
  title={DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning},
  author={Daochen Zha and Jingru Xie and Wenye Ma and Sheng Zhang and Xiangru Lian and Xia Hu and Ji Liu},
  booktitle={International Conference on Machine Learning},
  year={2021}
}
Games are abstractions of the real world, where artificial agents learn to compete and cooperate with other agents. While significant achievements have been made in various perfect- and imperfect-information games, DouDizhu (a.k.a. Fighting the Landlord), a three-player card game, is still unsolved. DouDizhu is a very challenging domain with competition, collaboration, imperfect information, large state space, and particularly a massive set of possible actions where the legal actions vary…
47 Citations
DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning
- Computer Science, 2022 IEEE Conference on Games (CoG)
- 2022
By integrating opponent modeling and coach-guided learning into DouZero, the DouDizhu AI system achieves better performance and ranks top in the Botzone leaderboard among more than 400 AI agents, including DouZero.
Deep Reinforcement Learning for Two-Player DouDizhu
- Computer Science, 2022 Euro-Asia Conference on Frontiers of Computer Science and Information Technology (FCSIT)
- 2022
This paper implements and improves the DouZero system on two-player DouDizhu, a variant of classic DouDizhu in which there is no cooperation between the players but more hidden information, and designs a filter network based on supervised learning to improve the quality of training data and thus accelerate training.
DanZero: Mastering GuanDan Game with Reinforcement Learning
- Computer Science, ArXiv
- 2022
This paper proposes DanZero, the first AI program for GuanDan based on reinforcement learning, uses a distributed framework to train the AI system, and reports the outstanding performance of DanZero.
PerfectDou: Dominating DouDizhu with Perfect Information Distillation
- Computer Science, NeurIPS
- 2022
This paper proposes PerfectDou, a state-of-the-art DouDizhu AI system that dominates the game, in an actor-critic framework with a proposed technique named perfect information distillation that allows the agents to utilize the global information to guide the training of the policies as if it is a perfect information game.
A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games
- Computer Science, ArXiv
- 2022
The empirical results demonstrate that the policies found by many existing methods including Neural Fictitious Self Play and Policy Space Response Oracle can be prone to exploitation by adversarial opponents, and the output policies of the proposed algorithms are robust to exploitation, and thus outperform existing methods.
Speedup Training Artificial Intelligence for Mahjong via Reward Variance Reduction
- Computer Science, 2022 IEEE Conference on Games (CoG)
- 2022
Results show that RVR significantly reduces the variance in Mahjong AI training and improves the model performance, as well as improving the training stability using an expected reward network to adapt to the complex, dynamic, and highly stochastic reward environment.
Hierarchical Architecture for Multi-Agent Reinforcement Learning in Intelligent Game
- Computer Science, 2022 International Joint Conference on Neural Networks (IJCNN)
- 2022
A hierarchical learning paradigm that methodically combines multi-agent and single-agent algorithms in multi-agent environments is proposed, and macro-operations are introduced to reduce the original action space while mitigating the scalability issue.
Distributed Deep Reinforcement Learning: A Survey and A Multi-Player Multi-Agent Learning Toolbox
- Computer Science, ArXiv
- 2022
A multi-player multi-agent distributed deep reinforcement learning toolbox is developed and released, and validated on Wargame, a complex environment, showing the usability of the proposed toolbox for multi-player, multi-agent distributed deep reinforcement learning in complex games.
More Like Real World Game Challenge for Partially Observable Multi-Agent Cooperation
- Computer Science
- 2023
The WGC is a lightweight, flexible, and easy-to-use environment with a clear framework that can be easily configured by users and introduces more challenges that better reflect the real-world characteristics.
TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play
- Computer Science, AAMAS
- 2023
This paper develops a multi-agent system to play the full 11 vs. 11 game mode, without demonstrations, and introduces several innovations, including adaptive curriculum learning, a novel self-play strategy, and an objective that optimizes the policies of multiple agents jointly.
60 References
Suphx: Mastering Mahjong with Deep Reinforcement Learning
- Computer Science, ArXiv
- 2020
An AI for Mahjong is designed, named Suphx, based on deep reinforcement learning with some newly introduced techniques including global reward prediction, oracle guiding, and run-time policy adaptation, which is the first time that a computer program outperforms most top human players in Mahjong.
Combinational Q-Learning for Dou Di Zhu
- Computer Science, AAAI 2019
- 2019
This paper proposes a novel method for handling combinatorial actions, called combinational Q-learning (CQL), which employs a two-stage network to reduce the action space and leverages order-invariant max-pooling operations to extract relationships between primitive actions.
DeltaDou: Expert-level Doudizhu AI through Self-play
- Computer Science, IJCAI
- 2019
The results show that self-play can significantly improve the performance of the agent in the multi-agent imperfect-information game Doudizhu, and that even when starting with a weak AI, the agent can reach human expert level after days of self-play and training.
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
- Computer Science, NIPS
- 2017
An algorithm is described, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection, which generalizes previous ones such as InRL.
Mastering Atari, Go, chess and shogi by planning with a learned model
- Computer Science, Nature
- 2020
The MuZero algorithm is presented, which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics.
Mastering Complex Control in MOBA Games with Deep Reinforcement Learning
- Computer Science, AAAI
- 2020
A deep reinforcement learning framework is presented to tackle the problem of complex action control in Multi-player Online Battle Arena (MOBA) 1v1 games; its low coupling and high scalability enable efficient exploration at large scale.
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
- Computer Science, ArXiv
- 2016
This paper introduces the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge, and combines fictitious self-play with deep reinforcement learning.
Combining Deep Reinforcement Learning and Search for Imperfect-Information Games
- Computer Science, NeurIPS
- 2020
Results show ReBeL leads to low exploitability in benchmark imperfect-information games and achieves superhuman performance in heads-up no-limit Texas hold'em poker, while using far less domain knowledge than any prior poker AI.
Towards Playing Full MOBA Games with Deep Reinforcement Learning
- Computer Science, NeurIPS
- 2020
This paper proposes a MOBA AI learning paradigm that methodologically enables playing full MOBA games with deep reinforcement learning, and develops a combination of novel and existing learning techniques, including curriculum self-play learning, policy distillation, off-policy adaption, multi-head value estimation, and Monte-Carlo tree-search.
Deep Reinforcement Learning with Double Q-Learning
- Computer Science, AAAI
- 2016
This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.