HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism
@article{Xu2021HAVENHC, title={HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism}, author={Zhiwei Xu and Yunpeng Bai and Bin Zhang and Dapeng Li and Guoliang Fan}, journal={ArXiv}, year={2021}, volume={abs/2110.07246} }
Multi-agent reinforcement learning often suffers from the exponentially large joint action space caused by a large number of agents. This paper proposes HAVEN, a novel value decomposition framework based on hierarchical reinforcement learning for fully cooperative multi-agent problems. To address the instability that arises from the concurrent optimization of high-level and low-level policies, as well as from the concurrent optimization across agents, we introduce the dual coordination mechanism of inter-layer…
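The abstract is truncated above, but the core idea it names (a hierarchy of policies on top of a value decomposition) can be sketched in code. The following is a minimal, hypothetical sketch rather than the authors' exact architecture: the names `HighLevelPolicy` and `LowLevelUtility`, the one-hot strategy conditioning, and the VDN-style additive mixing are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical two-level value-decomposition sketch (not HAVEN's exact design):
# a high-level module scores K abstract "strategies" per agent, and low-level
# per-agent utilities conditioned on the chosen strategy are mixed additively
# into a joint value.

class HighLevelPolicy(nn.Module):
    """Maps an agent's observation to scores over K abstract strategies."""
    def __init__(self, obs_dim: int, n_strategies: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_strategies)
        )

    def forward(self, obs):                      # obs: (batch, obs_dim)
        return self.net(obs)                     # (batch, n_strategies)

class LowLevelUtility(nn.Module):
    """Per-agent utility over primitive actions, conditioned on the chosen strategy."""
    def __init__(self, obs_dim: int, n_strategies: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_strategies, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def forward(self, obs, strategy_onehot):
        return self.net(torch.cat([obs, strategy_onehot], dim=-1))  # (batch, n_actions)

def joint_value(chosen_utilities):
    """VDN-style additive mixing: the joint value is the sum of per-agent utilities."""
    return chosen_utilities.sum()

# Toy usage with 3 agents sharing one network of each level.
obs_dim, n_strategies, n_actions, n_agents = 8, 4, 5, 3
high = HighLevelPolicy(obs_dim, n_strategies)
low = LowLevelUtility(obs_dim, n_strategies, n_actions)

obs = torch.randn(n_agents, obs_dim)
strategies = high(obs).argmax(dim=-1)                                # greedy high-level choice
strategy_onehot = torch.nn.functional.one_hot(strategies, n_strategies).float()
q_low = low(obs, strategy_onehot)                                    # (n_agents, n_actions)
chosen = q_low.max(dim=-1).values                                    # greedy low-level utilities
print(joint_value(chosen))                                           # scalar joint value
```

In HAVEN itself the mixing and the inter-layer/inter-agent coordination are more involved; this sketch only fixes the data flow from strategy selection down to a joint value.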
2 Citations
Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning
- Computer Science, ArXiv
- 2022
Under this method, agents can interact with the learned virtual environment and evaluate the current state value according to imagined future states, which gives agents foresight under any multi-agent value decomposition method.
Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning
- Computer Science, ArXiv
- 2022
A set of asynchronous multi-agent actor-critic methods is formulated that allows agents to directly optimize asynchronous policies under three paradigms: decentralized learning, centralized learning, and centralized training for decentralized execution.
References
Showing 1-10 of 43 references
Value-Decomposition Networks For Cooperative Multi-Agent Learning
- Computer Science, AAMAS
- 2018
This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
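As a reminder of what this decomposition looks like, VDN's joint action value is simply the sum of per-agent utilities computed from local action-observation histories:

```latex
Q_{\mathrm{tot}}\big(\boldsymbol{\tau}, \mathbf{u}\big) \;=\; \sum_{i=1}^{n} Q_i\big(\tau^i, u^i\big)
```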
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
- Computer Science, NIPS
- 2017
An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction
- Computer Science
- 2018
Three hierarchical deep MARL architectures are proposed to learn hierarchical policies under different MARL paradigms, together with a new experience replay mechanism that alleviates the sparsity of transitions at the high level of abstraction and the non-stationarity of multi-agent learning.
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
- Computer Science, ICML
- 2019
This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings using centrally computed critics that share an attention mechanism, which selects relevant information for each agent at every timestep and enables more effective and scalable learning in complex multi-agent environments compared to recent approaches.
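A minimal sketch of the attention step this summary describes, with hypothetical shapes and function names (only the scaled dot-product selection over the other agents' encodings is shown, not the full actor-critic training loop):

```python
import torch
import torch.nn.functional as F

# Hypothetical, simplified sketch: agent i's centralized critic attends over the
# other agents' encoded observation-action pairs to select relevant information.

def attention_over_agents(query_i, keys_others, values_others, scale):
    # query_i: (batch, d); keys_others, values_others: (batch, n-1, d)
    scores = torch.einsum("bd,bjd->bj", query_i, keys_others) / scale   # (batch, n-1)
    weights = F.softmax(scores, dim=-1)                                 # attention weights
    return torch.einsum("bj,bjd->bd", weights, values_others)           # weighted summary

batch, n_others, d = 4, 2, 16
q = torch.randn(batch, d)
k = torch.randn(batch, n_others, d)
v = torch.randn(batch, n_others, d)
summary = attention_over_agents(q, k, v, scale=d ** 0.5)
print(summary.shape)   # torch.Size([4, 16]) -> fed into agent i's critic head
```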
Inter-Level Cooperation in Hierarchical Reinforcement Learning
- Computer Science, ArXiv
- 2019
It is hypothesized that improved cooperation between the internal agents of a hierarchy can simplify the credit assignment problem from the perspective of the high-level policies, leading to significantly better training in settings where intricate sequences of action primitives must be performed to improve performance.
Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning
- Computer Science, ICLR
- 2019
It is found that, given an adequate set of subgoals from which to choose, FMH performs, and particularly scales, substantially better than cooperative approaches that use a shared reward function.
MAVEN: Multi-Agent Variational Exploration
- Computer Science, NeurIPS
- 2019
A novel approach called MAVEN is proposed that hybridises value-based and policy-based methods by introducing a latent space for hierarchical control, allowing MAVEN to achieve committed, temporally extended exploration, which is key to solving complex multi-agent tasks.
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
- Computer Science, ICML
- 2019
A new factorization method for MARL, QTRAN, is proposed that is free from the structural constraints (such as additivity and monotonicity) of prior factorization methods and takes a new approach: transforming the original joint action-value function into an easily factorizable one with the same optimal actions.
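The structural constraints QTRAN relaxes exist to preserve the Individual-Global-Max (IGM) consistency between the joint and per-agent value functions, which can be written as:

```latex
\arg\max_{\mathbf{u}} Q_{\mathrm{tot}}(\boldsymbol{\tau}, \mathbf{u})
\;=\;
\Big( \arg\max_{u^1} Q_1(\tau^1, u^1), \;\ldots,\; \arg\max_{u^n} Q_n(\tau^n, u^n) \Big)
```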
Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery
- Computer Science, AAMAS
- 2020
A two-level hierarchical multi-agent reinforcement learning (MARL) algorithm with unsupervised skill discovery is proposed that enables the emergence of useful skills and cooperative team play; the interpretability of the learned skills shows the promise of the proposed method for achieving human-AI cooperation in team sports games.
QPLEX: Duplex Dueling Multi-Agent Q-Learning
- Computer Science, ICLR
- 2021
A novel MARL approach called duPLEX dueling multi-agent Q-learning (QPLEX) is proposed, which takes a duplex dueling network architecture to factorize the joint value function, encodes the IGM principle into the neural network architecture, and thus enables efficient value function learning.