Decentralised Learning in Systems with Many, Many Strategic Agents

@article{Mguni2018DecentralisedLI,
  title={Decentralised Learning in Systems with Many, Many Strategic Agents},
  author={David Henry Mguni and Joel Jennings and Enrique Munoz de Cote},
  journal={ArXiv},
  year={2018},
  volume={abs/1803.05028}
}
Although multi-agent reinforcement learning can tackle systems of strategically interacting entities, it currently fails in scalability and lacks rigorous convergence guarantees. Crucially, learning in multi-agent systems can become intractable due to the explosion in the size of the state-action space as the number of agents increases. In this paper, we propose a method for computing closed-loop optimal policies in multi-agent systems that scales independently of the number of agents. This… 

On the Convergence of Model Free Learning in Mean Field Games
TLDR
This paper analyzes in full generality the convergence of a fictitious iterative scheme using any single agent learning algorithm at each step of the Mean Field MAS, and shows for the first time convergence of model free learning algorithms towards non-stationary MFG equilibria.
Decentralized Mean Field Games
TLDR
This paper proposes a new mean field system known as Decentralized Mean Field Games, where each agent can be quite different from others, and defines a theoretical solution concept for this system and provides a fixed point guarantee for a Q-learning based algorithm in this system.
Partially Observable Mean Field Reinforcement Learning
TLDR
This paper introduces a Q-learning based algorithm that can learn effectively in large environments with many agents learning simultaneously to achieve possibly distinct goals, and proves that this Q-learning estimate stays very close to the Nash Q-value for the first setting.
Model Free Reinforcement Learning Algorithm for Stationary Mean field Equilibrium for Multiple Types of Agents
TLDR
This paper numerically evaluates how multi-agent Markov strategic interaction over an infinite horizon can model cyber attacks among defenders and adversaries, and shows how an RL-based algorithm can converge to an equilibrium.
Modelling the Dynamics of Multiagent Q-Learning in Repeated Symmetric Games: a Mean Field Theoretic Approach
TLDR
This paper studies an n-agent setting in which n tends to infinity, such that agents learn their policies concurrently over repeated symmetric bimatrix games with some other agents, and derives a Fokker-Planck equation that describes the evolution of the probability distribution of Q-values in the agent population.
Reinforcement Learning in Stationary Mean-field Games
TLDR
This paper studies reinforcement learning in a specific class of multi-agent systems called mean-field games, and presents two reinforcement learning algorithms that converge to the right solution under mild technical conditions.
Learning in Nonzero-Sum Stochastic Games with Potentials
TLDR
This paper introduces a new generation of MARL learners that can handle nonzero-sum payoff structures and continuous settings and proves theoretically the learning method, SPot-AC, enables independent agents to learn Nash equilibrium strategies in polynomial time.
The Evolutionary Dynamics of Independent Learning Agents in Population Games
TLDR
A novel unified framework for characterising population dynamics via a single partial differential equation (Theorem 1) is provided and extensive experimental results validating that Theorem 1 holds for a variety of learning methods and population games are presented.
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
TLDR
To learn a Nash equilibrium of an MPG in which the size of state space and/or the number of players can be very large, new independent policy gradient algorithms are proposed that are run by all players in tandem.
Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
TLDR
This work proves that the proposed actor-critic algorithm converges to the Nash equilibrium at a linear rate, which is the first success of applying model-free reinforcement learning with function approximation to discrete-time mean-field Markov games with provable non-asymptotic global convergence guarantees.

References

SHOWING 1-10 OF 31 REFERENCES
Nash Q-Learning for General-Sum Stochastic Games
TLDR
This work extends Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games, and implements an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.
Markov Games as a Framework for Multi-Agent Reinforcement Learning
A Comprehensive Survey of Multiagent Reinforcement Learning
TLDR
The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.
Multiagent learning in large anonymous games
TLDR
It is shown that stage learning efficiently converges to Nash equilibria in large anonymous games if best-reply dynamics converge, and two features are identified that improve convergence.
Potential-based difference rewards for multiagent reinforcement learning
TLDR
Two novel reward functions that combine potential-based reward shaping to a wide range of multiagent systems without the need for domain specific knowledge are introduced, maintaining the theoretical guarantee of consistent Nash equilibria.
Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations
This exciting and pioneering new overview of multiagent systems, which are online systems composed of multiple interacting intelligent agents, e.g., online trading, offers a newly seen computer
Multi-agent Reinforcement Learning in Sequential Social Dilemmas
TLDR
This work analyzes the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games, and characterizes how learned behavior in each domain changes as a function of environmental factors including resource abundance.
Reinforcement Learning: An Introduction
TLDR
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Discrete Time, Finite State Space Mean Field Games
In this paper we report on some recent results for mean field models in discrete time with a finite number of states. These models arise in situations that involve a very large number of agents
Explicit solutions of some linear-quadratic mean field games
M. Bardi, Mathematics, Networks Heterog. Media, 2012
TLDR
The quadratic-Gaussian solution to a system of two differential equations of the kind introduced by Lasry and Lions in the theory of Mean Field Games is solved and the L-Q model is compared with other Mean Field models of population distribution.