What game are we playing? End-to-end learning in normal and extensive form games

@inproceedings{Ling2018WhatGA,
  title={What game are we playing? End-to-end learning in normal and extensive form games},
  author={Chun Kai Ling and Fei Fang and J. Zico Kolter},
  booktitle={IJCAI},
  year={2018}
}
Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, the underlying assumption in most past work is that the parameters of the game itself are known to the agents.  This paper deals with the relatively under-explored but equally important "inverse" setting, where the parameters of the underlying game are not known to all agents, but must be learned through observations.  We propose a differentiable, end-to-end learning framework for addressing… 
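The "inverse" setting can be illustrated with a toy sketch (my own illustration, not the paper's architecture): solve a logit quantal response equilibrium (QRE) of a small zero-sum matrix game, then fit unknown payoff entries so the predicted equilibrium matches observed play frequencies. The paper differentiates through the equilibrium analytically; here a finite-difference gradient is used as a stand-in, and the game size, temperature `lam`, and step sizes are all made up for the example.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def logit_qre(A, lam=1.0, iters=300, step=0.2):
    """Damped fixed-point iteration for a logit QRE of the zero-sum
    matrix game with row-player payoffs A (column player receives -A)."""
    m, n = A.shape
    x, y = np.ones(m) / m, np.ones(n) / n
    for _ in range(iters):
        x = (1 - step) * x + step * softmax(lam * (A @ y))
        y = (1 - step) * y + step * softmax(lam * (-A.T @ x))
    return x, y

def fit_payoffs(target_x, steps=150, lr=0.3, eps=1e-4):
    """Recover a 2x2 payoff matrix whose QRE matches observed row-player
    frequencies -- the 'inverse' problem. Uses finite-difference gradients
    as a stand-in for analytic differentiation through the equilibrium."""
    A = np.zeros((2, 2))
    def loss(A):
        x, _ = logit_qre(A)
        return float(((x - target_x) ** 2).sum())
    for _ in range(steps):
        base = loss(A)
        g = np.zeros_like(A)
        for i in range(2):
            for j in range(2):
                Ap = A.copy()
                Ap[i, j] += eps
                g[i, j] = (loss(Ap) - base) / eps
        A -= lr * g
    return A
```

On matching pennies the QRE is uniform at any temperature; for an asymmetric observed frequency like (0.7, 0.3), gradient descent on the payoff entries drives the predicted equilibrium toward the data.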


Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games
TLDR
This paper draws upon well-known ideas in decision theory to obtain a concise and interpretable agent behavior model, derives solvers and gradients for end-to-end learning, and proposes an efficient first-order primal-dual method that exploits the structure of extensive-form games.
A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
TLDR
This work shows that a single algorithm—a simple extension to mirror descent with proximal regularization that is called magnetic mirror descent (MMD)—can produce strong results in both settings, despite their fundamental differences, and proves that MMD converges linearly to QREs in extensive-form games.
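Under the negative-entropy mirror map, the magnetic mirror descent update described above has a simple closed form: an exponentiated-gradient step with an extra KL pull toward a fixed "magnet" distribution. A minimal sketch (my own toy instantiation, with made-up step sizes) on matching pennies, where the regularized equilibrium with a uniform magnet is uniform play:

```python
import numpy as np

def mmd_step(x, grad, magnet, eta=0.1, alpha=1.0):
    """One magnetic mirror descent step: mirror descent with a proximal
    KL pull of strength alpha toward the magnet distribution."""
    logits = (np.log(x) + eta * grad + eta * alpha * np.log(magnet)) / (1 + eta * alpha)
    e = np.exp(logits - logits.max())
    return e / e.sum()

A = np.array([[1.0, -1.0], [-1.0, 1.0]])  # matching pennies
rho = np.array([0.5, 0.5])                # uniform magnet
x = np.array([0.9, 0.1])                  # deliberately biased starts
y = np.array([0.2, 0.8])
for _ in range(500):
    # simultaneous updates; each player ascends its own payoff gradient
    x, y = mmd_step(x, A @ y, rho), mmd_step(y, -A.T @ x, rho)
```

The magnet term damps the rotational dynamics that make plain gradient play cycle in zero-sum games, which is the intuition behind the linear convergence to QREs claimed above.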
Learning Probably Approximately Correct Maximin Strategies in Simulation-Based Games with Infinite Strategy Spaces
TLDR
This work designs two algorithms with theoretical guarantees to learn maximin strategies in two-player zero-sum games with infinite strategy spaces, and formally proves δ-PAC guarantees for these algorithms under some regularity assumptions.
Structure Learning for Approximate Solution of Many-Player Games
TLDR
This work introduces an iterative structure-learning approach to search for approximate solutions of many-player games, assuming only black-box simulation access to noisy payoff samples and uses supervised learning (regression) to fit payoff values to the learned structures, in compact representations that facilitate equilibrium calculation.
Learn to Predict Equilibria via Fixed Point Networks
TLDR
This work introduces Nash Fixed Point Networks (N-FPNs), a class of implicit-depth neural networks that output Nash equilibria of contextual games that fuses data-driven modeling with provided constraints and exploits a novel constraint decoupling scheme to avoid costly projections.
End-to-End Learning and Intervention in Games
TLDR
This paper casts the equilibria of games as individual layers, integrates them into an end-to-end optimization framework, and proposes two approaches based respectively on explicit and implicit differentiation of the solutions to variational inequalities.
Towards the PAC Learnability of Nash Equilibrium
TLDR
It is proved that NE is agnostic PAC learnable if the hypothesis class for Nash predictors has bounded covering numbers, which justifies the feasibility of approximating NE through purely data-driven approaches, which benefits both game theorists and machine learning practitioners.
Integrating Learning with Game Theory for Societal Challenges
TLDR
This paper introduces work on integrating learning with computational game theory to address societal challenges such as security and sustainability, and discusses the role of machine learning and reinforcement learning in this field.
Nash equilibria in human sensorimotor interactions explained by Q-Learning
TLDR
This work compares different reinforcement learning models based on haptic feedback to human behavior in sensorimotor versions of three classic games, including the Prisoners’ Dilemma, and the symmetric and asymmetric matching pennies games and finds that Q-learning with intrinsic costs that disfavor deviations from average behavior explains the observed data best.
Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games
TLDR
An efficient algorithm is given that achieves O(T) regret with high probability for that setting, even when the agent faces an adversarial environment, and significantly outperforms the prior algorithms for the problem.
...

References

SHOWING 1-10 OF 39 REFERENCES
Learning payoff functions in infinite games
TLDR
This work considers a class of games with real-valued strategies and payoff information available only in the form of data from a given sample of strategy profiles, and addresses payoff-function learning as a standard regression problem, with provision for capturing known structure in the multiagent environment.
Learning in Games via Reinforcement and Regularization
TLDR
This paper extends several properties of exponential learning, including the elimination of dominated strategies, the asymptotic stability of strict Nash equilibria, and the convergence of time-averaged trajectories in zero-sum games with an interior Nash equilibrium.
Learning equilibria of games via payoff queries
TLDR
This work studies a corresponding computational learning model and the query complexity of learning equilibria for various classes of games, and obtains the stronger result that an equilibrium can be identified while only a small fraction of the cost values are learned.
Theoretical and Practical Advances on Smoothing for Extensive-Form Games
TLDR
A new weighting scheme for the dilated entropy function is introduced and it is shown that, for the first time, the excessive gap technique can be made faster than the fastest counterfactual regret minimization algorithm, CFR+, in practice.
Computational Rationalization: The Inverse Equilibrium Problem
TLDR
Employing the game-theoretic notion of regret and the principle of maximum entropy, this work introduces a technique for predicting and generalizing behavior in competitive and cooperative multi-agent domains.
Gradient methods for Stackelberg security games
TLDR
This paper presents a new gradient-based approach for solving large Stackelberg games in security settings, and demonstrates that it can have negligible regret against the leader's true equilibrium strategy, while scaling to large games.
An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning
TLDR
This paper contributes a comprehensive presentation of the relevant techniques for solving stochastic games from both the game theory and reinforcement learning communities, and examines the assumptions and limitations of these algorithms.
Quantal Response Equilibria for Extensive Form Games
This article investigates the use of standard econometric models for quantal choice to study equilibria of extensive form games. Players make choices based on a quantal-choice model and assume other…
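In its common logit form, a quantal-choice model assigns each action a probability proportional to exp(λ·utility), so better actions are played more often but not deterministically. A generic illustration of that choice rule (not this article's specific econometric specification):

```python
import math

def logit_choice(utilities, lam):
    """Logit quantal response: P(a) is proportional to exp(lam * u(a)).
    lam -> 0 gives uniform random play; lam -> infinity approaches
    a deterministic best response."""
    weights = [math.exp(lam * u) for u in utilities]
    z = sum(weights)
    return [w / z for w in weights]
```

For example, with utilities (1, 0) and λ = 2, the better action is chosen with probability σ(2) ≈ 0.88 rather than probability 1.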
DeepStack: Expert-level artificial intelligence in heads-up no-limit poker
TLDR
DeepStack is introduced, an algorithm for imperfect-information settings that combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition that is automatically learned from self-play using deep learning.
Heads-up limit hold’em poker is solved
TLDR
It is announced that heads-up limit Texas hold’em is now essentially weakly solved, and this computation formally proves the common wisdom that the dealer in the game holds a substantial advantage.
...