Corpus ID: 219687994

Comparative Evaluation of Multi-Agent Deep Reinforcement Learning Algorithms

Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht
Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly used evaluation tasks and criteria, making comparisons between approaches difficult. In this work, we evaluate and compare three different classes of MARL algorithms (independent learners, centralised training with decentralised execution, and value decomposition) in a diverse range of multi-agent learning tasks. Our results show that (1) algorithm performance depends strongly on environment properties and no…
Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients
This paper introduces semi-on-policy (SOP) training as an effective and computationally efficient way to address the sample inefficiency of on-policy policy gradient methods, and enhances two state-of-the-art policy gradient algorithms with SOP training, demonstrating significant performance improvements.
DSDF: An approach to handle stochastic agents in collaborative multi-agent reinforcement learning
DSDF tunes the discount factor for each agent according to its uncertainty and uses the resulting values to update the utility networks of the individual agents. This enables joint coordination among agents, some of which may be only partially performing, and can thereby reduce or delay the cost of agent/robot replacement in many circumstances.
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms on a Building Energy Demand Coordination Task
An empirical comparison of three classes of MARL algorithms is contributed: independent learners, centralized critics with decentralized execution, and value factorization learners. These algorithms are evaluated on an energy coordination task in CityLearn, an OpenAI Gym environment.
Structured Diversification Emergence via Reinforced Organization Control and Hierarchical Consensus Learning
Comparative experiments on four large-scale cooperation tasks show that Rochico is significantly better than the current SOTA algorithms in terms of exploration efficiency and cooperation strength.
Local Information Agent Modelling in Partially-Observable Environments
This paper provides a comprehensive evaluation and ablation studies in cooperative, competitive and mixed multi-agent environments, showing that the method achieves significantly higher returns than baseline methods which do not use the learned representations.
Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing
This work proposes a novel method to automatically identify agents which may benefit from sharing parameters by partitioning them based on their abilities and goals. It combines the increased sample efficiency of parameter sharing with the representational capacity of multiple independent networks to reduce training time and increase final returns.
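The partitioning idea above can be sketched in a few lines. Note the paper derives the partition from learned agent encodings; the explicit `(ability, goal)` trait tuples below are a simplifying assumption for illustration:

```python
from collections import defaultdict

def partition_for_sharing(agent_traits):
    """Group agents whose (ability, goal) traits match; each group then
    shares a single policy network, while distinct groups keep independent
    parameters."""
    groups = defaultdict(list)
    for agent, traits in sorted(agent_traits.items()):
        groups[traits].append(agent)
    return dict(groups)

traits = {
    "scout_1": ("scout", "explore"),
    "scout_2": ("scout", "explore"),
    "carrier_1": ("carrier", "deliver"),
}
print(partition_for_sharing(traits))
```

The two scouts end up in one parameter-sharing group and the carrier keeps its own network, so sample efficiency within a group is preserved without forcing heterogeneous agents onto one set of weights.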
The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games
This work investigates MAPPO, a multi-agent variant of PPO which adopts a centralized value function, and finds that compared to off-policy baselines, MAPPO achieves better or comparable sample complexity as well as substantially faster running time.
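MAPPO keeps the standard PPO clipped surrogate for each actor; the multi-agent twist is that the advantage comes from a centralised value function conditioned on global state. A minimal single-sample sketch of the clipped objective (toy numbers, not the paper's hyperparameters):

```python
def ppo_clip(ratio, advantage, eps=0.2):
    """PPO clipped surrogate for one sample:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A).
    In MAPPO, A is estimated by a centralised critic that sees the global
    state, while each actor's probability ratio r uses only its own
    local observation."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)

# A ratio far above 1 + eps is clipped, capping the incentive to push
# the policy further in that direction.
print(ppo_clip(1.5, 1.0))   # 1.2
# With a negative advantage, the pessimistic min takes the clipped term.
print(ppo_clip(0.5, -1.0))  # -0.8
```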
An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective
This work provides a self-contained assessment of the current state-of-the-art MARL techniques from a game theoretical perspective and expects this work to serve as a stepping stone for both new researchers who are about to enter this fast-growing domain and existing domain experts who want to obtain a panoramic view and identify new directions based on recent advances.
Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning
We propose a novel framework for value function factorization in multi-agent deep reinforcement learning using graph neural networks (GNNs). In particular, we consider the team of agents as the set…


Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning
This paper surveys recent works that address the non-stationarity problem in multi-agent deep reinforcement learning, with methods ranging from modifications in the training procedure, to learning representations of the opponent's policy, meta-learning, communication, and decentralized learning.
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings, using centrally computed critics that share an attention mechanism which selects relevant information for each agent at every timestep. This enables more effective and scalable learning in complex multi-agent environments compared to recent approaches.
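The attention mechanism in such a critic can be sketched with plain scaled dot-product attention: each agent's query scores the other agents' encodings, and the critic aggregates their values by relevance. The 2-D encodings below are toy values, not learned ones:

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention: weight the other agents' (key, value)
    encodings by their relevance to this agent's query, then return the
    weighted sum of values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)                      # subtract max for stable softmax
    exp = [math.exp(s - m) for s in scores]
    z = sum(exp)
    weights = [e / z for e in exp]
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(dim)]

# The query aligns with the first agent's key, so its value dominates.
out = attention([1.0, 0.0],
                [[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 1.0], [0.0, 0.0]])
print(out)  # both components > 0.5, pulled towards [1, 1]
```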
The StarCraft Multi-Agent Challenge
The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL framework including state-of-the-art algorithms is released.
Value-Decomposition Networks For Cooperative Multi-Agent Learning
This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
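The core of the decomposition is that the joint value is the sum of per-agent utilities, so each agent can act greedily on its own utility and still maximise the team value. A tabular toy sketch (the paper learns the utilities with neural networks; the numbers below are made up):

```python
def joint_q(utilities, actions):
    """Additive decomposition: Q_tot(a_1, ..., a_n) = sum_i Q_i(a_i)."""
    return sum(q[a] for q, a in zip(utilities, actions))

def decentralised_greedy(utilities):
    """Each agent argmaxes its own utility; under the additive
    decomposition this joint action also maximises Q_tot."""
    return [max(range(len(q)), key=q.__getitem__) for q in utilities]

# Two agents, three actions each (toy utility values, not learned ones).
utils = [[0.1, 0.5, 0.2], [0.3, 0.0, 0.7]]
acts = decentralised_greedy(utils)
print(acts, joint_q(utils, acts))  # [1, 2] 1.2
```

This additivity is exactly what makes decentralised execution cheap: no agent needs to search the exponential joint action space.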
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
By embracing deep neural networks, this work is able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability.
Benchmarking Deep Reinforcement Learning for Continuous Control
This work presents a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure.
Opponent Modeling in Deep Reinforcement Learning
Inspired by the recent success of deep reinforcement learning, this work presents neural-based models that jointly learn a policy and the behavior of opponents, and uses a Mixture-of-Experts architecture to encode observations of the opponents into a deep Q-Network.
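The Mixture-of-Experts combination can be sketched as follows: each expert predicts Q-values under one assumed opponent strategy, and a gating network weights the experts. The gate logits below stand in for the output of a hypothetical gating network over opponent features; the Q-values are toy numbers:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exp = [math.exp(x - m) for x in logits]
    z = sum(exp)
    return [e / z for e in exp]

def mixture_q(expert_qs, gate_logits):
    """Combine per-expert Q-values: Q(s, a) = sum_k w_k * Q_k(s, a),
    where w = softmax(gate_logits) comes from opponent features."""
    w = softmax(gate_logits)
    n_actions = len(expert_qs[0])
    return [sum(w[k] * expert_qs[k][a] for k in range(len(expert_qs)))
            for a in range(n_actions)]

# Expert 0 models an aggressive opponent, expert 1 a passive one (toy values).
print(mixture_q([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))   # uniform gate
print(mixture_q([[1.0, 0.0], [0.0, 1.0]], [10.0, 0.0]))  # gate favours expert 0
```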
LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning
This paper proposes to merge the two directions of MARL by learning for each agent an intrinsic reward function which diversely stimulates the agents at each time step, and compares LIIR with a number of state-of-the-art MARL methods on battle games in StarCraft II.
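The reward shaping at the heart of this idea can be sketched simply: each agent optimises a proxy that adds its own learned intrinsic term to the shared team reward. The mixing weight `lam` and the reward values below are hypothetical, and the intrinsic rewards are given as constants rather than learned:

```python
def shaped_rewards(extrinsic, intrinsic, lam=0.1):
    """Per-agent proxy reward: shared team reward plus a weighted,
    agent-specific intrinsic reward that can differentiate agents even
    when the extrinsic signal is identical for everyone."""
    return [extrinsic + lam * r_in for r_in in intrinsic]

# Same team reward, but the two agents are stimulated differently.
print(shaped_rewards(1.0, [0.5, -0.5]))
```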
Modeling Others using Oneself in Multi-Agent Reinforcement Learning
Self Other-Modeling (SOM) is proposed, in which an agent uses its own policy to predict the other agent's actions and update its belief about their hidden state in an online manner. Agents using SOM are able to learn better policies using their estimate of the other players' hidden states in both cooperative and adversarial settings.
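The online belief update can be sketched as a Bayes rule over the other agent's hidden goal, with the agent's own policy standing in as the action likelihood. The two-goal, two-action policy below is a hypothetical stand-in (the paper infers the hidden state by optimising through its network, not by enumerating goals):

```python
def update_belief(belief, own_policy, state, observed_action):
    """SOM-style update: treat our own policy as a model of the other
    agent and apply Bayes' rule over its hidden goal after observing
    one of its actions: P(g | a) ∝ P(a | s, g) * P(g)."""
    post = {goal: prior * own_policy(state, goal)[observed_action]
            for goal, prior in belief.items()}
    z = sum(post.values())
    return {goal: p / z for goal, p in post.items()}

def own_policy(state, goal):
    """Toy 2-action policy: goal 'left' mostly picks action 0,
    goal 'right' mostly picks action 1."""
    return [0.9, 0.1] if goal == "left" else [0.1, 0.9]

belief = {"left": 0.5, "right": 0.5}
belief = update_belief(belief, own_policy, None, 0)
print(belief)  # posterior shifts strongly towards 'left'
```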