Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems

@article{Matignon2012IndependentRL,
  title={Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems},
  author={La{\"e}titia Matignon and Guillaume J. Laurent and Nadine Le Fort-Piat},
  journal={The Knowledge Engineering Review},
  year={2012},
  volume={27},
  pages={1--31}
}
In the framework of fully cooperative multi-agent systems, independent (non-communicative) agents that learn by reinforcement must overcome several difficulties in order to coordinate. [...] Key Result: Furthermore, the distilled challenges may assist in the design of new learning algorithms that overcome these problems and achieve higher performance in multi-agent applications.
Decentralized reinforcement social learning based on cooperative policy exploration in multi-agent systems
  • Chi Wang, X. Chen
  • Computer Science
  • 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
  • 2017
TLDR
A new algorithm named Decentralized concurrent learning and cooperative policy exploration (DCL-CPE) is contributed, which possesses the ability to overcome the coordination problems and the stochastic rewards via local interaction under the social learning framework.
Multiagent Reinforcement Social Learning toward Coordination in Cooperative Multiagent Systems
TLDR
This work investigates the multiagent coordination problems in cooperative environments under a social learning framework, and distinguishes two different types of learners depending on the amount of information each agent can perceive: individual action learners and joint action learners.
A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents
TLDR
The Q-value function is applied to estimate the gradient and derive the probability of maximal reward based on estimated gradient ascent (PMR-EGA) algorithm, which can be naturally extended to optimize cooperative stochastic games.
Reinforcement social learning of coordination in cooperative multiagent systems
TLDR
This paper investigates the multiagent coordination problems in cooperative environments under the social learning framework, in which there exists a large population of agents and each agent interacts with another agent randomly in each round.
Lenient Learning in Independent-Learner Stochastic Cooperative Games
TLDR
The Lenient Multiagent Reinforcement Learning 2 (LMRL2) algorithm for independent-learner stochastic cooperative games is introduced and it is shown that LMRL2 is very effective in both of the authors' measures, and is found in the top rank more often than any other technique.
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
TLDR
An algorithm is described, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection, which generalizes previous ones such as InRL.
Explicitly Coordinated Policy Iteration
TLDR
This work proposes the explicitly coordinated policy iteration (EXCEL) algorithm which always forces agents to coordinate by comparing the agents’ separated optimistic and average value functions and proposes three solutions for deep reinforcement learning extensions of EXCEL.
Reinforcement Social Learning of Coordination in Networked Cooperative Multiagent Systems
TLDR
This work investigates the multiagent coordination problems in cooperative environments under the networked social learning framework focusing on two representative topologies: the small-world and the scale-free network.
Decentralized Q-Learning in Zero-sum Markov Games
TLDR
A radically uncoupled Q-learning dynamics that is both rational and convergent: the learning dynamics converges to the best response to the opponent’s strategy when the opponent follows an asymptotically stationary strategy; the value function estimates converge to the payoffs at a Nash equilibrium when both agents adopt the dynamics.
The Dynamics of Reinforcement Social Learning in Cooperative Multiagent Systems
TLDR
This work investigates the multiagent coordination problems in cooperative environments under the social learning framework by considering a large population of agents where each agent interacts with another agent randomly chosen from the population in each round.

References

SHOWING 1-10 OF 71 REFERENCES
A study of FMQ heuristic in cooperative multi-agent games
TLDR
A modified version of the FMQ heuristic is proposed which achieves this detection and the update adaptation of the cause of noise and is more robust and very easy to set.
Hysteretic Q-learning: an algorithm for decentralized reinforcement learning in cooperative multi-agent teams
TLDR
This article focuses on decentralized reinforcement learning (RL) in cooperative MAS, where a team of independent learning robots (IL) try to coordinate their individual behavior to reach a coherent joint behavior, and suggests a Q-learning extension for ILs, called hysteretic Q-learning.
An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning
TLDR
This paper contributes a comprehensive presentation of the relevant techniques for solving stochastic games from both the game theory community and reinforcement learning communities, and examines the assumptions and limitations of these algorithms.
Nash Q-Learning for General-Sum Stochastic Games
TLDR
This work extends Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games, and implements an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.
Multi-Agent Reinforcement Learning in Common Interest and Fixed Sum Stochastic Games: An Experimental Study
TLDR
This work is a comprehensive empirical study conducted on MGS, a simulation system developed for this purpose, and demonstrates the strengths and weaknesses of the different approaches to MARL through application of FriendQ, OAL, WoLF, FoeQ, Rmax, and other algorithms, and supplies an informal analysis of the resulting learning processes.
A Multiagent Reinforcement Learning Algorithm with Non-linear Dynamics
TLDR
A new MARL algorithm called the Weighted Policy Learner (WPL), which allows agents to reach a Nash Equilibrium (NE) in benchmark 2-player-2-action games with minimum knowledge and outperforms the state-of-the-art algorithms in a more realistic setting of 100 agents interacting and learning concurrently.
The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems
TLDR
This work distinguishes reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to learn the value of joint actions and the strategies of their counterparts, and proposes alternative optimistic exploration strategies that increase the likelihood of convergence to an optimal equilibrium.
Classes of Multiagent Q-learning Dynamics with epsilon-greedy Exploration
TLDR
This work derives and studies an idealization of Q-learning in 2-player 2-action repeated general-sum games, and addresses the discontinuous case of ε-greedy exploration and uses it as a proxy for value-based algorithms to highlight a contrast with existing results in policy search.
A Comprehensive Survey of Multiagent Reinforcement Learning
TLDR
The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.
Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents
  • M. Tan
  • Computer Science
  • ICML
  • 1993
TLDR
This paper shows that additional sensation from another agent is beneficial if it can be used efficiently, sharing learned policies or episodes among agents speeds up learning at the cost of communication, and for joint tasks, agents engaging in partnership can significantly outperform independent agents although they may learn slowly in the beginning.
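Illustrative note: the hysteretic Q-learning entry in the list above describes a Q-learning extension for independent learners. The sketch below shows one way such an update is commonly written, assuming the two-learning-rate form (a larger rate `alpha` when the temporal-difference error is non-negative, a smaller rate `beta` otherwise). The function name, the dictionary-based Q-table, and all parameter values are hypothetical choices for illustration and are not taken from this page.

```python
from collections import defaultdict


def hysteretic_update(q, state, action, reward, next_state, n_actions,
                      alpha=0.1, beta=0.01, gamma=0.9):
    """Apply one hysteretic-style Q update for a single independent learner.

    Assumption: beta < alpha, so decreases of the estimate are damped
    relative to increases.
    """
    # Greedy value of the next state from this learner's own table.
    best_next = max(q[(next_state, a)] for a in range(n_actions))
    # Temporal-difference error of the observed transition.
    delta = reward + gamma * best_next - q[(state, action)]
    # Use the larger rate for non-negative errors, the smaller one otherwise.
    rate = alpha if delta >= 0 else beta
    q[(state, action)] += rate * delta
    return q[(state, action)]


# Hypothetical usage: one learner, one state, two actions.
q = defaultdict(float)
hysteretic_update(q, 0, 1, 10.0, 0, 2)   # non-negative error, rate alpha
hysteretic_update(q, 0, 1, -10.0, 0, 2)  # negative error, rate beta
```

The smaller rate on negative errors is intended to make a learner less sensitive to low rewards that may be caused by teammates' exploratory actions rather than by its own choice.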