Corpus ID: 219636339

Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning

@article{Christianos2020SharedEA,
  title={Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning},
  author={Filippos Christianos and Lukas Sch{\"a}fer and Stefano V. Albrecht},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.07169}
}
Exploration in multi-agent reinforcement learning is a challenging problem, especially in environments with sparse rewards. We propose a general method for efficient exploration by sharing experience amongst agents. Our proposed algorithm, called Shared Experience Actor-Critic (SEAC), applies experience sharing in an actor-critic framework. We evaluate SEAC in a collection of sparse-reward multi-agent environments and find that it consistently outperforms two baselines and two state-of-the-art…
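The abstract excerpt does not spell out the update rule, so the following is only a minimal sketch of what sharing experience in an actor-critic policy loss could look like. It assumes that transitions collected by other agents are reused with an importance-sampling correction and a weighting factor; the function name `shared_experience_policy_loss`, the parameter `lam`, the stateless softmax policy, and the transition layout are all hypothetical simplifications for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shared_experience_policy_loss(own_logits, batches, agent, lam=1.0):
    """Policy-gradient loss for one agent over its own and shared experience.

    own_logits: action logits of the learning agent (stateless policy,
                an assumption made to keep the sketch short).
    batches:    dict mapping agent id -> list of transitions, each a tuple
                (action, advantage, behaviour_prob), where behaviour_prob is
                the probability the collecting agent assigned to the action.
    agent:      id of the learning agent.
    lam:        hypothetical weight on experience from other agents.
    """
    probs = softmax(own_logits)
    loss = 0.0
    for collector, transitions in batches.items():
        for action, advantage, behaviour_prob in transitions:
            log_prob = math.log(probs[action])
            if collector == agent:
                # On-policy term: the agent's own experience, unweighted.
                weight = 1.0
            else:
                # Off-policy term: importance ratio corrects for the fact
                # that another agent's policy chose this action.
                weight = lam * probs[action] / behaviour_prob
            loss += -weight * log_prob * advantage
    return loss


if __name__ == "__main__":
    batches = {
        "agent_a": [(0, 1.0, 0.5)],   # agent_a's own transition
        "agent_b": [(1, 2.0, 0.25)],  # transition shared by agent_b
    }
    print(shared_experience_policy_loss([0.0, 0.0], batches, agent="agent_a"))
```

Setting `lam=0.0` in this sketch recovers an ordinary per-agent loss that ignores shared experience, which makes the contribution of the shared term easy to inspect in isolation.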

Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
TLDR: Cooperative multi-agent exploration (CMAE) is proposed, where the goal is selected from multiple projected state spaces via a normalized entropy-based technique and agents are trained to reach this goal in a coordinated manner.
Regularize! Don't Mix: Multi-Agent Reinforcement Learning without Explicit Centralized Structures
TLDR: The proposed Multi-Agent Regularized Q-learning (MARQ) aims to address limitations in the MARL context by applying regularization constraints that can correct bias in off-policy, out-of-distribution agent experiences and promote diverse exploration.
LIEF: Learning to Influence through Evaluative Feedback
We present a multi-agent reinforcement learning framework where rewards are not only generated by the environment but also by other peers in it through inter-agent evaluative feedback. We show that…
Greedy UnMixing for Q-Learning in Multi-Agent Reinforcement Learning
TLDR: Greedy UnMix aims to avoid scenarios where MARL methods fail due to overestimation of values in the large joint state-action space, using a conservative Q-learning approach that restricts the state-marginal in the dataset to avoid unobserved joint state-action spaces.
LINDA: Multi-Agent Local Information Decomposition for Awareness of Teammates
TLDR: A novel framework, multi-agent Local INformation Decomposition for Awareness of teammates (LINDA), with which agents learn to decompose local information and build awareness for each teammate, significantly improves the learning performance, especially on challenging tasks.
A Novel Hierarchical Soft Actor-Critic Algorithm for Multi-Logistics Robots Task Allocation
TLDR: Simulation experiments show that the proposed hierarchical Soft Actor-Critic algorithm can make multi-logistics AGV robots work together and improves the reward in sparse environments by about 2.61 times compared to the SAC algorithm.
Duplicated Replay Buffer for Asynchronous Deep Deterministic Policy Gradient
TLDR: This research intends to make the transaction selection process more efficient by increasing the likelihood of selecting important transactions from the replay memory buffer, using a secondary replay memory buffer that stores more critical transactions.
Patch AutoAugment
TLDR: A patch-level automatic DA algorithm called Patch AutoAugment (PAA) is proposed, which allows each patch DA operation to be controlled by an agent and models it as a Multi-Agent Reinforcement Learning (MARL) problem.
Behavioral model summarisation for other agents under uncertainty
  • Yinghui Pan, Biyang Ma, Jing Tang, Yifeng Zeng
  • Computer Science
  • Information Sciences
  • 2022
TLDR: This article contributes a new model selection technique for modelling other agents, thereby improving the decision quality of a subject agent interacting with them, and inspires a creative use of sub-modular function optimisation in multiagent decision making.
Robust control and training risk reduction for boiler level control using two-stage training deep deterministic policy gradient
  • Jia-Lin Kang, S. Mirzaei, Jia-An Zhou
  • Computer Science
  • Journal of the Taiwan Institute of Chemical Engineers
  • 2021
TLDR: 2S-DDPG can address the shortcomings of the traditional DDPG model and ensures stable industrial operation due to the lowered risk of process failures in training.
