Mean Field Multi-Agent Reinforcement Learning
Existing multi-agent reinforcement learning methods are typically limited to a small number of agents. When the number of agents grows large, learning becomes intractable due to the curse of
Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games
TLDR
This paper introduces a Multiagent Bidirectionally-Coordinated Network (BiCNet) with a vectorised extension of the actor-critic formulation and demonstrates that, without any supervision such as human demonstrations or labelled data, BiCNet can learn various types of advanced coordination strategies commonly used by experienced game players.
Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games
TLDR
This analysis demonstrates that, without any supervision such as human demonstrations or labelled data, BiCNet can learn various types of coordination strategies that are similar to those of experienced game players, and that it is easily adaptable to tasks with heterogeneous agents.
Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning
TLDR
This paper theoretically derives a general formula of $Q_{tot}$ in terms of $Q^{i}$, based on which a multi-head attention formation can naturally be implemented to approximate $Q_{tot}$, resulting in not only a refined representation of $Q_{tot}$ with an agent-level attention mechanism, but also a tractable maximization algorithm of decentralized policies.
Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning
TLDR
Under the PR2 framework, decentralized-training-decentralized-execution algorithms are developed that are proved to converge in the self-play scenario when there is one Nash equilibrium and experiments show that it is critical to reason about how the opponents believe about what the agent believes.
Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning
TLDR
This paper addresses the order dispatching problem using multi-agent reinforcement learning (MARL), which follows the distributed nature of the peer-to-peer ridesharing problem and can capture the stochastic demand-supply dynamics in large-scale ridesharing scenarios.
Large-Scale Home Energy Management Using Entropy-Based Collective Multiagent Reinforcement Learning Framework
TLDR
This paper focuses on a microgrid in which a large number of modern homes interact to optimize their electricity cost, and presents an Entropy-Based Collective Multiagent Deep Reinforcement Learning (EB-C-MADRL) framework to address it.
SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving
TLDR
The design goals of SMARTS (Scalable Multi-Agent RL Training School) are described, its basic architecture and its key features are explained, and its use is illustrated through concrete multi-agent experiments on interactive scenarios.
Learning in Nonzero-Sum Stochastic Games with Potentials
TLDR
This paper introduces a new generation of MARL learners that can handle nonzero-sum payoff structures and continuous settings, and proves theoretically that the learning method, SPot-AC, enables independent agents to learn Nash equilibrium strategies in polynomial time.
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning
TLDR
Results show that HATRPO and HAPPO significantly outperform strong baselines such as IPPO, MAPPO and MADDPG on all tested tasks, thereby establishing a new state of the art in MARL.
...
...