• Corpus ID: 119260193

Recursive Markov Process for Iterated Games with Markov Strategies

  • Shohei Hidaka
  • Published 2 September 2015
  • Mathematics
  • arXiv: Probability
The dynamics of games in which multiple players adaptively learn from their past experience are not yet well understood. We analyzed a class of stochastic games with Markov strategies in which players choose their actions probabilistically. This class is formulated as a $k^{\text{th}}$-order Markov process, in which the probability of choice is a function of $k$ past states. With a reasonably large $k$ or with the limit $k \to \infty$, numerical analysis of this random process is…
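For the simplest case, $k = 1$, the process described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's method: it assumes a two-player game with two actions (labelled "C" and "D" here), players who choose independently given the previous joint state, and strategy probabilities strictly between 0 and 1 so that the chain has a unique stationary distribution. The names `STATES`, `transition_matrix`, and `stationary_distribution` are hypothetical.

```python
# Joint states of the previous round as (player 1 action, player 2 action).
# The two-action setup and the labels "C"/"D" are assumptions for illustration.
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]


def transition_matrix(p, q):
    """Build the 4x4 transition matrix of the first-order joint-state chain.

    p[s] is the probability that player 1 plays "C" after joint state s;
    q[s] is the same for player 2. Both are hypothetical Markov strategies.
    """
    matrix = []
    for s in STATES:
        row = []
        for a1, a2 in STATES:
            prob1 = p[s] if a1 == "C" else 1.0 - p[s]
            prob2 = q[s] if a2 == "C" else 1.0 - q[s]
            # Assumption: the players randomize independently of each other.
            row.append(prob1 * prob2)
        matrix.append(row)
    return matrix


def stationary_distribution(matrix, iterations=10_000):
    """Approximate the stationary distribution by repeated left-multiplication."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


# Example: both players mostly repeat mutual cooperation and rarely forgive.
p_example = {("C", "C"): 0.9, ("C", "D"): 0.2, ("D", "C"): 0.6, ("D", "D"): 0.1}
q_example = {("C", "C"): 0.9, ("C", "D"): 0.6, ("D", "C"): 0.2, ("D", "D"): 0.1}
example_dist = stationary_distribution(transition_matrix(p_example, q_example))
for state, prob in zip(STATES, example_dist):
    print(state, round(prob, 4))
```

For a memory of length $k$, the same construction applies with states given by the last $k$ joint actions, so the state space grows as $4^k$ in this two-action setting. That growth is one way to see why direct numerical analysis becomes costly for large $k$.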

Stochastic Games
  • L. Shapley
  • Medicine, Mathematics
    Proceedings of the National Academy of Sciences
  • 1953
In a stochastic game the play proceeds by steps from position to position, according to transition probabilities controlled jointly by the two players, and the expected total gain or loss is bounded by $M$, which depends on $N^2 + N$ matrices.
Which Types of Learning Make a Simple Game Complex?
The present study focuses on a class of games with reinforcement-learning agents that adaptively choose their actions to locally maximize their rewards, and shows inconsistency between the limit model and two other models with more general reinforcement learning.
Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent
  • W. Press, F. Dyson
  • Psychology, Medicine
    Proceedings of the National Academy of Sciences
  • 2012
It is generally assumed that there exists no simple ultimatum strategy whereby one player can enforce a unilateral claim to an unfair share of rewards, but it is shown that such strategies unexpectedly do exist.
Behavioral Game Theory
Game theory is a mathematical tool to describe and analyze situations of conflict, cooperation, and coordination. In rational player models it is typically assumed that players are highly rational
Chaos in learning a simple two-person game
That chaos can occur in learning a simple game indicates one should use caution in assuming real people will learn to play a game according to a Nash equilibrium strategy, as it provides an important self-consistency condition for determining when players will learn to behave as though they were fully rational.
Multiagent reinforcement learning in the Iterated Prisoner's Dilemma.
This paper investigates the ability of a variety of Q-learning agents to play the IPD game against an unknown opponent and finds that agents with longer history windows, lookup table memories, and longer exploration schedules fared best in the IPD games.
Learning Through Reinforcement and Replicator Dynamics
A version of Bush and Mosteller's stochastic learning theory in the context of games is considered and it is shown that in the continuous time limit the biological model coincides with the deterministic, continuous time replicator process.
Stochastic strategies in the Prisoner's Dilemma
A complete analysis of all strategies where the probability to cooperate depends only on the opponent's previous move is given for the infinitely iterated Prisoner's Dilemma. All Nash
Intrinsic noise in game dynamical learning.
  • T. Galla
  • Mathematics, Medicine
    Physical review letters
  • 2009
It is shown that similar noise-sustained trajectories arise in game dynamical learning, where the stochasticity has a different origin: agents sample a finite number of moves of their opponents in between adaptation events, whereas the limit of infinite batches results in deterministic modified replicator equations.
Equilibrium Points in N-Person Games.
  • J. Nash
  • Mathematics, Medicine
    Proceedings of the National Academy of Sciences of the United States of America
  • 1950
One may define a concept of an n-person game in which each player has a finite set of pure strategies and in which a definite set of payments to the n players corresponds to each n-tuple of pure