Corpus ID: 211296475

Efficient exploration of zero-sum stochastic games

@article{Martn2020EfficientEO,
  title={Efficient exploration of zero-sum stochastic games},
  author={C. Mart{\'i}n and T. Sandholm},
  journal={ArXiv},
  year={2020},
  volume={abs/2002.10524}
}
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay, such as in financial or military simulations and computer games. During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well. After that, the algorithm has to produce a strategy that has low exploitability. Our motivation is to…
