• Corpus ID: 219636339

Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning

Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht

Exploration in multi-agent reinforcement learning is a challenging problem, especially in environments with sparse rewards. We propose a general method for efficient exploration by sharing experience amongst agents. Our proposed algorithm, called Shared Experience Actor-Critic (SEAC), applies experience sharing in an actor-critic framework. We evaluate SEAC in a collection of sparse-reward multi-agent environments and find that it consistently outperforms two baselines and two state-of-the-art… 

Collaborative Training of Multiple Autonomous Agents

This work presents two algorithms that aim to be a middle ground between not sharing parameters and fully sharing parameters, and proposes a novel parameter sharing method that can be coupled with existing multi-agent reinforcement learning algorithms.

Selectively Sharing Experiences Improves Multi-Agent Reinforcement Learning

We present a novel multi-agent RL approach, Selective Multi-Agent PER, in which agents share with other agents a limited number of transitions they observe during training. They follow a similar

Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks

This work provides a systematic evaluation and comparison of three different classes of MARL algorithms in a diverse range of cooperative multi-agent learning tasks, and open-sources EPyMARL, which extends the PyMARL codebase to include additional algorithms and allow for flexible configuration of algorithm implementation details.

Cooperative Exploration for Multi-Agent Deep Reinforcement Learning

Cooperative multi-agent exploration (CMAE) is proposed, where the goal is selected from multiple projected state spaces via a normalized entropy-based technique and agents are trained to reach this goal in a coordinated manner.

Celebrating Diversity in Shared Multi-Agent Reinforcement Learning

This paper proposes an information-theoretical regularization to maximize the mutual information between agents’ identities and their trajectories, encouraging extensive exploration and diverse individualized behaviors in shared multi-agent reinforcement learning.

Task Generalisation in Multi-Agent Reinforcement Learning

The problem of task generalisation is discussed, and the difficulty of zero-shot generalisation and fine-tuning is demonstrated with preliminary results, using multi-robot warehouse coordination as an example.

MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer

A novel method named MASER (MARL with subgoals generated from experience replay buffer) is proposed, which significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark.

Regularize! Don't Mix: Multi-Agent Reinforcement Learning without Explicit Centralized Structures

The proposed Multi-Agent Regularized Q-learning (MARQ) aims to address limitations in the MARL context by applying regularization constraints which can correct bias in off-policy, out-of-distribution agent experiences and promote diverse exploration.

Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning

It is suggested that on certain multi-modal problems, policy gradient, which is often considered sample-inefficient as an on-policy method, can be preferable to popular value-based learning methods.

Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing

This work proposes a novel method to automatically identify agents which may benefit from sharing parameters by partitioning them based on their abilities and goals, which combines the increased sample efficiency of parameter sharing with the representational capacity of multiple independent networks to reduce training time and increase final returns.



Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.

Value-Decomposition Networks For Cooperative Multi-Agent Learning

This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.

Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning

This paper surveys recent works that address the non-stationarity problem in multi-agent deep reinforcement learning; the methods range from modifications in the training procedure to learning representations of the opponent's policy, meta-learning, communication, and decentralized learning.

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

A new distributed agent IMPALA (Importance Weighted Actor-Learner Architecture) is developed that not only uses resources more efficiently in single-machine training but also scales to thousands of machines without sacrificing data efficiency or resource utilisation.

MAVEN: Multi-Agent Variational Exploration

A novel approach called MAVEN is proposed that hybridises value and policy-based methods by introducing a latent space for hierarchical control, which allows MAVEN to achieve committed, temporally extended exploration, which is key to solving complex multi-agent tasks.

Simultaneously Learning and Advising in Multiagent Reinforcement Learning

A multiagent advising framework where multiple agents can advise each other while learning in a shared environment is proposed and it is shown that the learning process is improved by incorporating this kind of advice.

The StarCraft Multi-Agent Challenge

The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL framework including state-of-the-art algorithms is released.

Markov Games as a Framework for Multi-Agent Reinforcement Learning

Autonomously Reusing Knowledge in Multiagent Reinforcement Learning

An important challenge problem for the AI community is defined, the existing methods for knowledge reuse are surveyed, the gaps in the current literature are highlighted, and "low-hanging fruit" for those interested in the area are identified.

Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning

EPC is introduced, a curriculum learning paradigm that scales up Multi-Agent Reinforcement Learning (MARL) by progressively increasing the population of training agents in a stage-wise manner and uses an evolutionary approach to fix an objective misalignment issue throughout the curriculum.