• Corpus ID: 2560029

Regret Minimization in Non-Zero-Sum Games with Applications to Building Champion Multiplayer Computer Poker Agents

Richard G. Gibson
In two-player zero-sum games, if both players minimize their average external regret, then the average of the strategy profiles converges to a Nash equilibrium. For n-player general-sum games, however, theoretical guarantees for regret minimization are less understood. Nonetheless, Counterfactual Regret Minimization (CFR), a popular regret minimization algorithm for extensive-form games, has generated winning three-player Texas Hold'em agents in the Annual Computer Poker Competition (ACPC). In… 
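The two-player zero-sum claim in the abstract can be illustrated with a small sketch. The code below is not taken from the thesis: it runs plain regret matching (a simpler relative of CFR, for matrix games) in self-play and averages the strategies played. The 2x2 payoff matrix `PAYOFF` and the function names are hypothetical choices made for illustration; the matrix was picked so that the unique equilibrium is easy to check by hand (both players mix 0.4 / 0.6).

```python
def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No action has positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def self_play(payoff, iterations):
    """Run regret matching for both players and return their average strategies.

    payoff[i][j] is the row player's payoff; the column player receives -payoff[i][j].
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        row = regret_matching_strategy(row_regret)
        col = regret_matching_strategy(col_regret)
        # Expected payoff of each pure action against the opponent's current mix.
        row_util = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                    for i in range(n_rows)]
        col_util = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                    for j in range(n_cols)]
        row_value = sum(row[i] * row_util[i] for i in range(n_rows))
        col_value = sum(col[j] * col_util[j] for j in range(n_cols))
        # External regret: how much better each pure action would have done.
        for i in range(n_rows):
            row_regret[i] += row_util[i] - row_value
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_util[j] - col_value
            col_sum[j] += col[j]
    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


# Hypothetical zero-sum game (assumption, not from the thesis).
# Its unique equilibrium has both players choosing their first action with probability 0.4.
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]
avg_row, avg_col = self_play(PAYOFF, 100_000)
print(avg_row, avg_col)
```

The current strategies may keep oscillating; the convergence guarantee applies to the averages `avg_row` and `avg_col`. With three or more players, or general-sum payoffs, the same procedure still runs, but this guarantee no longer applies in general, which is the gap the abstract points to.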


Computing Strong Game-Theoretic Strategies and Exploiting Suboptimal Opponents in Large Games

This work proposes a new paradigm in which relevant portions of the game are solved in real time in much finer degrees of granularity than the abstract game which is solved offline, enabling us to solve games with significantly less abstraction for the initial betting rounds.


A variant of the CFR algorithm, called CFR-Jr, is designed that approaches the set of CCEs with a regret bound sub-linear in the size of the game, and it is shown to be dramatically faster than CFR-S and the state-of-the-art algorithms for computing CCEs.

Solving zero-sum extensive-form games with arbitrary payoff uncertainty models

This work proposes a method, Harsanyi-Counterfactual Regret Minimization, to solve two-player zero-sum extensive-form games with arbitrary payoff distribution models, and addresses the problem of arbitrary continuous payoff distributions.

Learning to Correlate in Multi-Player General-Sum Sequential Games

Experiments on a rich testbed of multi-player, general-sum sequential games show that both CFR-S and CFR-Jr are dramatically faster than the state-of-the-art algorithms to compute CCEs, with CFR-Jr being also a good heuristic to find socially-optimal CCEs.

Scalable sub-game solving for imperfect-information games

What's in a game: game-theoretic analysis for third party planning

It is demonstrated how counterfactual regret minimization can assist third party planners under these circumstances.

Designing Learning Algorithms over the Sequence Form of an Extensive-Form Game

This paper shows that some learning algorithms defined over the normal form can be re-defined over the sequence form so that the dynamics of the two algorithms are realization equivalent, which allows an exponential compression of the representation and therefore makes such algorithms employable in practice.

Learning to Play No-Press Diplomacy with Best Response Policy Iteration

This work considers Diplomacy, a 7-player board game designed to accentuate dilemmas resulting from many-agent interactions, and proposes a simple yet effective approximate best response operator, designed to handle large combinatorial action spaces and simultaneous moves.

State of the Art on: Regret Minimization for Non-Cooperative Games

Algorithmic game theory provides the mathematical models, in the form of game representations and equilibrium concepts, to describe the problems and their solutions, and either to solve those problems efficiently or to prove that they are hard to solve.

Further developments of extensive-form replicator dynamics using the sequence-form representation

It is shown that sequence-form constraints and realization equivalence to standard replicator dynamics are maintained in general n-player games and can minimize regret, leading to equilibrium convergence guarantees in two-player zero-sum games.

Using counterfactual regret minimization to create competitive multiplayer poker agents

It is believed that CFR-generated agents may perform well in multiplayer games, and it is demonstrated that good strategies can be obtained by grafting sets of two-player subgame strategies to a 3-player base strategy after one of the players is eliminated.

Regret Minimization in Games with Incomplete Information

It is shown that minimizing counterfactual regret minimizes overall regret and can therefore be used in self-play to compute a Nash equilibrium. The technique is demonstrated in the domain of poker, where it solves abstractions of limit Texas Hold'em with as many as 10^12 states, two orders of magnitude larger than previous methods.

Monte Carlo Sampling for Regret Minimization in Extensive Games

A general family of domain-independent CFR sample-based algorithms called Monte Carlo counterfactual regret minimization (MCCFR) is described, of which the original and poker-specific versions are special cases.

Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization

This work presents new sampling techniques that consider sets of chance outcomes during each traversal to produce slower, more accurate iterations of Counterfactual Regret Minimization, and demonstrates that this new CFR update converges more quickly than chance-sampled CFR in the large domains of poker and Bluff.

Iterated Regret Minimization: A New Solution Concept

Finding Optimal Abstract Strategies in Extensive-Form Games

This work presents for the first time an algorithm which efficiently finds optimal abstract strategies (strategies with minimal exploitability in the unabstracted game), and uses this technique to find the least exploitable strategy ever reported for two-player limit Texas hold'em.

Monte Carlo sampling and regret minimization for equilibrium computation and decision-making in large extensive form games

This thesis investigates the problem of decision-making in large two-player zero-sum games using Monte Carlo sampling and regret minimization methods, and develops a theory for applying counterfactual regret minimization to a generic subset of imperfect recall games.

Effective short-term opponent exploitation in simplified poker

This work explores two approaches to opponent modelling in the context of Kuhn poker, a small game for which game-theoretic solutions are known, some of which are preferable because they speed learning of the opponent’s strategy by exploring it more effectively.

Abstraction pathologies in extensive games

This paper shows that the standard approach to finding strong strategies for large extensive games rests on shaky ground, and that pathologies arise when abstracting both chance nodes and a player's actions.

A Competitive Texas Hold'em Poker Player via Automated Abstraction and Real-Time Equilibrium Computation

It is demonstrated that the game theory-based heads-up Texas Hold'em poker player, GS1, which incorporates very little poker-specific knowledge, is competitive with leading poker-playing programs which incorporate extensive domain knowledge, as well as with advanced human players.