Deep Counterfactual Regret Minimization
@article{Brown2018DeepCR, title={Deep Counterfactual Regret Minimization}, author={Noam Brown and Adam Lerer and Sam Gross and Tuomas Sandholm}, journal={ArXiv}, year={2018}, volume={abs/1811.00164} }
Counterfactual Regret Minimization (CFR) is the leading framework for solving large imperfect-information games. [...] Key Result: This is the first non-tabular variant of CFR to be successful in large games.
137 Citations
SINGLE DEEP COUNTERFACTUAL REGRET MINIMIZATION
- Computer Science
- 2019
Single Deep CFR is introduced, a variant of Deep CFR that has a lower overall approximation error by avoiding the training of an average strategy network and is more attractive from a theoretical perspective and empirically outperforms Deep CFR with respect to exploitability and one-on-one play in poker.
NNCFR: Minimize Counterfactual Regret with Neural Networks
- Computer Science
- ArXiv
- 2021
This paper introduces Neural Network Counterfactual Regret Minimization (NNCFR), an improved variant of Deep CFR that converges faster by using a dueling network as the value network. A new loss function is also designed for training the policy network, which can make the policy network more stable.
Single Deep Counterfactual Regret Minimization
- Computer Science
- ArXiv
- 2019
Single Deep CFR is introduced, a simplified variant of Deep CFR that has a lower overall approximation error by avoiding the training of an average strategy network and is more attractive from a theoretical perspective and empirically outperforms Deep CFR with respect to exploitability and one-on-one play in poker.
DOUBLE NEURAL COUNTERFACTUAL REGRET MINIMIZATION
- Computer Science
- 2019
This paper proposes a double neural representation for imperfect-information games (IIGs), where one neural network represents the cumulative regret and the other represents the average strategy, and achieves strong performance while using hundreds of times less memory than tabular CFR.
IMIZATION FOR EXTENSIVE GAMES WITH IMPERFECT INFORMATION
- Computer Science
- 2019
Lazy-CFR, a CFR algorithm that adopts a lazy update strategy to avoid traversing the whole game tree in each round, is presented. It is proved that the regret of Lazy-CFR is almost the same as that of vanilla CFR, while Lazy-CFR only needs to visit a small portion of the game tree.
Solving imperfect-information games via exponential counterfactual regret minimization
- Computer Science
- ArXiv
- 2020
This paper proposes a novel CFR-based method, exponential counterfactual regret minimization (ECFR), presents an exponential reduction technique for regret during iteration, and proves that ECFR has a good theoretical convergence guarantee.
Scalable sub-game solving for imperfect-information games
- Computer Science, Economics
- Knowl. Based Syst.
- 2021
Bounds for Approximate Regret-Matching Algorithms
- Computer Science
- ArXiv
- 2019
This paper gives regret bounds when a regret minimizing algorithm uses estimates instead of true values, the first to generalize to a larger class of $(\Phi, f)$-regret matching algorithms, and includes different forms of regret such as swap, internal, and external regret.
RLCFR: Minimize Counterfactual Regret by Deep Reinforcement Learning
- Computer Science
- Expert Syst. Appl.
- 2022
Acquiring Strategies for the Board Game Geister by Regret Minimization
- Computer Science
- 2019 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)
- 2019
This paper proposes a variant of Deep CFR for board games, applies it to the game Geister, and shows that the proposed method can train agents with an appropriate strategy.
55 References
Single Deep Counterfactual Regret Minimization
- Computer Science
- ArXiv
- 2019
Single Deep CFR is introduced, a simplified variant of Deep CFR that has a lower overall approximation error by avoiding the training of an average strategy network and is more attractive from a theoretical perspective and empirically outperforms Deep CFR with respect to exploitability and one-on-one play in poker.
Solving Imperfect-Information Games via Discounted Regret Minimization
- Computer Science
- AAAI
- 2019
This paper introduces novel CFR variants that 1) discount regrets from earlier iterations in various ways, 2) reweight iterations in various ways to obtain the output strategies, 3) use a non-standard regret minimizer, and/or 4) leverage "optimistic regret matching".
Using Regret Estimation to Solve Games Compactly
- Computer Science
- 2016
It is suggested that such abstractions can be largely subsumed by a regressor on game features that estimates regret during CFR, and the regressor essentially becomes a tunable, compact, and dynamic abstraction of abstractions.
Double Neural Counterfactual Regret Minimization
- Computer Science
- ICLR
- 2020
This paper proposes a double neural representation for imperfect-information games, where one neural network represents the cumulative regret and the other represents the average strategy, and adopts the counterfactual regret minimization algorithm to optimize this double neural representation.
Stable-Predictive Optimistic Counterfactual Regret Minimization
- Computer Science
- ICML
- 2019
This work presents the first CFR variant that breaks the square-root dependence on iterations, and shows that this method is faster than the original CFR algorithm, although not as fast as newer variants, in spite of their worst-case $O(T^{-1/2})$ dependence on iterations.
Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization
- Computer Science
- AAMAS
- 2012
This work presents new sampling techniques that consider sets of chance outcomes during each traversal to produce slower, more accurate iterations of Counterfactual Regret Minimization, and demonstrates that this new CFR update converges more quickly than chance-sampled CFR in the large domains of poker and Bluff.
Monte Carlo Sampling for Regret Minimization in Extensive Games
- Computer Science
- NIPS
- 2009
A general family of domain-independent CFR sample-based algorithms called Monte Carlo counterfactual regret minimization (MCCFR) is described, of which the original and poker-specific versions are special cases.
Time and Space: Why Imperfect Information Games are Hard
- Computer Science
- 2018
The thesis introduces an analysis of counterfactual regret minimisation (CFR), an algorithm for solving extensive-form games, and presents tighter regret bounds that describe the rate of progress, as well as presenting a series of theoretical tools for using decomposition, and creating algorithms which operate on small portions of a game at a time.
Solving Large Sequential Games with the Excessive Gap Technique
- Computer Science
- NeurIPS
- 2018
It is shown that a particular first-order method, a state-of-the-art variant of the excessive gap technique---instantiated with the dilated entropy distance function---can efficiently solve large real-world problems competitively with CFR and its variants.
Regret Minimization in Games with Incomplete Information
- Computer Science, Economics
- NIPS
- 2007
It is shown how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium, and is demonstrated in the domain of poker, showing it can solve abstractions of limit Texas Hold'em with as many as $10^{12}$ states, two orders of magnitude larger than previous methods.
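The claim that regret minimization in self-play leads to a Nash equilibrium can be illustrated on a much smaller problem. The sketch below is not the CFR algorithm of any listed paper: it runs plain regret matching on rock-paper-scissors, a one-shot zero-sum game, with expected rather than sampled regret updates. The names `PAYOFF`, `regret_matching`, `self_play`, and `exploitability`, the iteration count, and the non-uniform starting regrets are all assumptions made for illustration.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors; the column player receives the negative.
PAYOFF = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])


def regret_matching(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No action has positive regret: fall back to the uniform strategy.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))


def self_play(iterations=20000):
    """Run regret matching for both players and return their average strategies."""
    # Hypothetical non-uniform starting regrets, so play does not start at equilibrium.
    regrets = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
    strategy_sums = [np.zeros(3), np.zeros(3)]
    for _ in range(iterations):
        row = regret_matching(regrets[0])
        col = regret_matching(regrets[1])
        row_values = PAYOFF @ col        # expected payoff of each row action
        col_values = -(row @ PAYOFF)     # expected payoff of each column action
        # Regret of an action = its value minus the value of the current strategy.
        regrets[0] += row_values - row @ row_values
        regrets[1] += col_values - col @ col_values
        strategy_sums[0] += row
        strategy_sums[1] += col
    return [s / s.sum() for s in strategy_sums]


def exploitability(row_avg, col_avg):
    """Sum of both players' best-response values; zero at a Nash equilibrium."""
    return (PAYOFF @ col_avg).max() + (-(row_avg @ PAYOFF)).max()


if __name__ == "__main__":
    row_avg, col_avg = self_play()
    print("average strategies:", row_avg, col_avg)
    print("exploitability:", exploitability(row_avg, col_avg))
```

It is the average strategies, not the final iterates, that should approach the uniform equilibrium of rock-paper-scissors. In a zero-sum game the exploitability of the average strategies equals the sum of the two players' average regrets, so it should shrink as the iteration count grows.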