# What game are we playing? End-to-end learning in normal and extensive form games

```bibtex
@inproceedings{Ling2018WhatGA,
  title     = {What game are we playing? End-to-end learning in normal and extensive form games},
  author    = {Chun Kai Ling and Fei Fang and J. Zico Kolter},
  booktitle = {IJCAI},
  year      = {2018}
}
```

Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, the underlying assumption in most past work is that the parameters of the game itself are known to the agents. This paper deals with the relatively under-explored but equally important "inverse" setting, where the parameters of the underlying game are not known to all agents, but must be learned through observations. We propose a differentiable, end-to-end learning framework for addressing…
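The framework hinges on the fact that equilibria of suitably regularized (quantal-response-style) zero-sum games vary smoothly with the payoff parameters, so an equilibrium solver can sit inside a gradient-based learning loop. The sketch below is illustrative rather than the authors' implementation: it computes a logit quantal response equilibrium of a 2x2 zero-sum game by damped fixed-point iteration and estimates the sensitivity of the equilibrium strategy to one payoff entry by finite differences (the names `qre`, `lam`, and `alpha` are ours, not from the paper).

```python
import math

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def matvec(A, y):
    return [sum(a * b for a, b in zip(row, y)) for row in A]

def qre(A, lam=1.0, iters=2000, alpha=0.1):
    """Damped fixed-point iteration for the logit QRE of a zero-sum game.

    Row player maximizes x^T A y with rationality lam; column player minimizes.
    """
    n, m = len(A), len(A[0])
    x = [1.0 / n] * n
    y = [1.0 / m] * m
    At = [[A[i][j] for i in range(n)] for j in range(m)]  # transpose
    for _ in range(iters):
        bx = softmax([lam * v for v in matvec(A, y)])    # row's smoothed best response
        by = softmax([-lam * v for v in matvec(At, x)])  # column's smoothed best response
        x = [(1 - alpha) * xi + alpha * bi for xi, bi in zip(x, bx)]
        y = [(1 - alpha) * yi + alpha * bi for yi, bi in zip(y, by)]
    return x, y

# Matching pennies: the unique quantal response equilibrium is uniform play.
A = [[1.0, -1.0], [-1.0, 1.0]]
x, y = qre(A)

# The QRE map is smooth in the payoffs, so its Jacobian exists; here we
# estimate d x[0] / d A[0][0] by central differences -- the quantity an
# end-to-end learner would backpropagate through.
eps = 1e-4
Ap = [row[:] for row in A]; Ap[0][0] += eps
Am = [row[:] for row in A]; Am[0][0] -= eps
grad = (qre(Ap)[0][0] - qre(Am)[0][0]) / (2 * eps)
```

In the paper's actual framework such derivatives are obtained analytically, by implicit differentiation of the equilibrium conditions rather than finite differences, which lets the hidden payoff parameters be fitted to observed play with standard stochastic gradient methods.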

## 55 Citations

Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games

- Computer Science, Economics · AAAI
- 2019

This paper draws upon well-known ideas in decision theory to obtain a concise and interpretable agent behavior model, derives solvers and gradients for end-to-end learning, and proposes an efficient first-order primal-dual method that exploits the structure of extensive-form games.

Learning Probably Approximately Correct Maximin Strategies in Simulation-Based Games with Infinite Strategy Spaces

- Computer Science · AAMAS
- 2020

This work designs two algorithms with theoretical guarantees for learning maximin strategies in two-player zero-sum games with infinite strategy spaces, and formally proves δ-PAC guarantees for these algorithms under some regularity assumptions.

Structure Learning for Approximate Solution of Many-Player Games

- Computer Science · AAAI
- 2020

This work introduces an iterative structure-learning approach to search for approximate solutions of many-player games, assuming only black-box simulation access to noisy payoff samples, and uses supervised learning (regression) to fit payoff values to the learned structures, in compact representations that facilitate equilibrium calculation.

Learn to Predict Equilibria via Fixed Point Networks

- Economics · arXiv
- 2021

This work introduces Nash Fixed Point Networks (N-FPNs), a class of implicit-depth neural networks that output Nash equilibria of contextual games; the approach fuses data-driven modeling with provided constraints and exploits a novel constraint-decoupling scheme to avoid costly projections.

End-to-End Learning and Intervention in Games

- Economics · NeurIPS
- 2020

This paper casts the equilibria of games as individual layers, integrates them into an end-to-end optimization framework, and proposes two approaches, based respectively on explicit and implicit differentiation of the solutions to variational inequalities.

Towards the PAC Learnability of Nash Equilibrium

- Computer Science
- 2021

It is proved that NE is agnostically PAC learnable if the hypothesis class of Nash predictors has bounded covering numbers, which justifies the feasibility of approximating NE through purely data-driven approaches and benefits both game theorists and machine learning practitioners.

Integrating Learning with Game Theory for Societal Challenges

- Computer Science · IJCAI
- 2019

This paper introduces work on integrating learning with computational game theory to address societal challenges such as security and sustainability, and discusses the roles of machine learning and reinforcement learning in this field.

Nash equilibria in human sensorimotor interactions explained by Q-Learning

- Economics · bioRxiv
- 2021

This work compares different reinforcement learning models to human behavior in sensorimotor versions, with haptic feedback, of three classic games, including the Prisoner's Dilemma and the symmetric and asymmetric matching pennies games, and finds that Q-learning with intrinsic costs that disfavor deviations from average behavior explains the observed data best.

Nash equilibria in human sensorimotor interactions explained by Q-learning with intrinsic costs

- Economics · Scientific Reports
- 2021

This work compares different reinforcement learning models to human behavior engaged in sensorimotor interactions with haptic feedback based on three classic games, including the prisoner's dilemma, and the symmetric and asymmetric matching pennies games, and finds that Q-learning with intrinsic costs that disfavor deviations from average behavior explains the observed data best.

Multiagent trajectory models via game theory and implicit layer-based learning

- Computer Science · arXiv
- 2020

An end-to-end trainable architecture is proposed that hybridizes neural nets with game-theoretic reasoning, has interpretable intermediate representations, and transfers to robust downstream decision making, accompanied by a new class of continuous potential games.

## References

Showing 1-10 of 39 references.

Learning payoff functions in infinite games

- Economics · Machine Learning
- 2007

This work considers a class of games with real-valued strategies and payoff information available only in the form of data from a given sample of strategy profiles, and addresses payoff-function learning as a standard regression problem, with provision for capturing known structure in the multiagent environment.

Learning in Games via Reinforcement and Regularization

- Economics · Math. Oper. Res.
- 2016

This paper extends several properties of exponential learning, including the elimination of dominated strategies, the asymptotic stability of strict Nash equilibria, and the convergence of time-averaged trajectories in zero-sum games with an interior Nash equilibrium.

Learning equilibria of games via payoff queries

- Economics · EC '13
- 2013

This work studies a corresponding computational learning model and the query complexity of learning equilibria for various classes of games, and obtains the stronger result that an equilibrium can be identified while learning only a small fraction of the cost values.

Theoretical and Practical Advances on Smoothing for Extensive-Form Games

- Computer Science · EC
- 2017

A new weighting scheme for the dilated entropy function is introduced, and it is shown that, for the first time, the excessive gap technique can be made faster in practice than the fastest counterfactual regret minimization algorithm, CFR+.

Computational Rationalization: The Inverse Equilibrium Problem

- Computer Science · ICML
- 2011

Employing the game-theoretic notion of regret and the principle of maximum entropy, this work introduces a technique for predicting and generalizing behavior in competitive and cooperative multi-agent domains.

Gradient methods for stackelberg security games

- Computer Science · UAI
- 2016

This paper presents a new gradient-based approach for solving large Stackelberg games in security settings, and demonstrates that it can have negligible regret against the leader's true equilibrium strategy, while scaling to large games.

An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning

- Computer Science
- 2000

This paper contributes a comprehensive presentation of the relevant techniques for solving stochastic games from both the game theory and reinforcement learning communities, and examines the assumptions and limitations of these algorithms.

Quantal Response Equilibria for Extensive Form Games

- Economics
- 1998

This article investigates the use of standard econometric models for quantal choice to study equilibria of extensive form games. Players make choices based on a quantal-choice model and assume other…

DeepStack: Expert-level artificial intelligence in heads-up no-limit poker

- Computer Science · Science
- 2017

DeepStack is introduced, an algorithm for imperfect-information settings that combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition that is automatically learned from self-play using deep learning.

Heads-up limit hold’em poker is solved

- Computer Science · Science
- 2015

It is announced that heads-up limit Texas hold’em is now essentially weakly solved, and this computation formally proves the common wisdom that the dealer in the game holds a substantial advantage.