Adaptive Two-stage Learning Algorithm for Repeated Games

Wataru Fujita, Koichi Moriyama, Ken-ichi Fukui, Masayuki Numao
In our society, people engage in a variety of interactions. To analyze them, we model the interactions as games and the people as agents equipped with reinforcement learning algorithms. Reinforcement learning algorithms are widely studied with the goal of identifying strategies that earn large payoffs in games; however, existing algorithms learn slowly because they require a large number of interactions. In this work, we constructed an algorithm that both learns quickly and…

Multi-model Adaptive Learning for Robots Under Uncertainty

A novel variant of fictitious play is proposed that uses multi-model adaptive filters to estimate other players’ strategies; it can serve as a coordination mechanism between players who must make decisions under uncertainty.
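For orientation, the sketch below shows classic fictitious play only, not the multi-model adaptive filter variant summarized above. The player keeps counts of the opponent's past actions, treats the normalized counts as a belief, and best-responds to it. The function name, the payoff matrix, and the counts are illustrative assumptions.

```python
def fictitious_play_action(payoff, opponent_counts):
    """Return the row action that best responds to the opponent's empirical play.

    payoff[i][j] is the row player's payoff for row action i against column action j.
    opponent_counts[j] is how often the opponent has played column action j so far.
    """
    total = sum(opponent_counts)
    # Empirical belief over the opponent's actions.
    belief = [count / total for count in opponent_counts]
    # Expected payoff of each row action under that belief.
    expected = [sum(p * b for p, b in zip(row, belief)) for row in payoff]
    # Index of the action with the highest expected payoff.
    return max(range(len(expected)), key=expected.__getitem__)


# Hypothetical 2x2 coordination game from the row player's point of view.
payoff = [[2.0, 0.0],
          [0.0, 1.0]]
action = fictitious_play_action(payoff, opponent_counts=[1, 3])
```

With these counts the belief puts more weight on the opponent's second action, so the best response is the second row even though the first row has the larger maximum payoff.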



The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems

This work distinguishes reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to learn the value of joint actions and the strategies of their counterparts, and proposes alternative optimistic exploration strategies that increase the likelihood of convergence to an optimal equilibrium.

Learning To Cooperate in a Social Dilemma: A Satisficing Approach to Bargaining

This work modifies and analyzes a satisficing algorithm based on (Karandikar et al., 1998) that is compatible with the bargaining perspective, and develops an M-action, N-player social dilemma that encodes the key elements of the Prisoner's Dilemma.
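As a rough illustration of aspiration-based satisficing, the sketch below keeps the current action when the payoff meets the aspiration level, otherwise picks a new action at random, and then moves the aspiration toward the received payoff. This is a simplified reading of the general idea. The exact rule in the cited work may differ, and the function name, the `rate` parameter, and the action labels are assumptions.

```python
import random


def satisficing_step(action, payoff, aspiration, actions, rate=0.9, rng=random):
    """One step of a simplified satisficing rule (illustrative, not the paper's exact rule)."""
    if payoff >= aspiration:
        # Satisfied: repeat the action that met the aspiration.
        next_action = action
    else:
        # Dissatisfied: choose uniformly at random among the available actions.
        next_action = rng.choice(actions)
    # Move the aspiration toward the payoff just received.
    new_aspiration = rate * aspiration + (1.0 - rate) * payoff
    return next_action, new_aspiration


next_action, aspiration = satisficing_step(
    "C", payoff=3.0, aspiration=2.0, actions=["C", "D"]
)
```

Here the payoff exceeds the aspiration, so the agent keeps playing "C" and its aspiration rises slightly toward the payoff.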

Nash Q-Learning for General-Sum Stochastic Games

This work extends Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games, and implements an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.

Effective learning in the presence of adaptive counterparts

Ensemble Algorithms in Reinforcement Learning

  • M. Wiering, H. V. Hasselt
  • Computer Science
    IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
  • 2008
Several ensemble methods that combine multiple different reinforcement learning (RL) algorithms in a single agent to enhance learning speed and final performance by combining the chosen actions or action probabilities of different RL algorithms are described.

Technical Note: Q-Learning

A convergence theorem is presented, proving that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
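The update behind that result is the standard tabular rule Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)]. The sketch below applies it once; the dictionary-based table, the state and action labels, and the values of `alpha` and `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)  # unseen state-action pairs start at 0.0
actions = ["C", "D"]
new_value = q_update(q, "s0", "C", reward=1.0, next_state="s0", actions=actions)
```

The discrete table mirrors the theorem's requirement that action-values be represented discretely. The other condition, that every action is sampled repeatedly in every state, depends on the exploration policy, which this sketch leaves out.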

On-line Q-learning using connectionist systems

Simulations show that on-line learning algorithms are less sensitive to the choice of training parameters than backward replay, and that the alternative update rules of MCQ-L and Q(λ) are more robust than standard Q-learning updates.

Learning to compete, coordinate, and cooperate in repeated games using reinforcement learning

It is proved that M-Qubed’s average payoffs meet or exceed its maximin value in the limit, and it is demonstrated that an agent can learn to make good compromises, and hence receive high payoffs in repeated games, by effectively encoding and balancing best-response, cautious, and optimistic learning biases.