Corpus ID: 222291715

Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning

Navid Naderializadeh, Fan Hung, Sean Soleyman, Deepak Khosla
We propose a novel framework for value function factorization in multi-agent deep reinforcement learning using graph neural networks (GNNs). In particular, we consider the team of agents as the set of nodes of a complete directed graph, whose edge weights are governed by an attention mechanism. Building upon this underlying graph, we introduce a mixing GNN module, which is responsible for two tasks: i) factorizing the team state-action value function into individual per-agent observation-action… 
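The attention-governed graph and the mixing step described in the abstract can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's implementation; the function names and the single-pass mixing are assumptions for the sake of the example:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_adjacency(embeddings):
    """Edge weights of the complete directed agent graph via scaled dot-product attention."""
    d = embeddings.shape[-1]
    scores = embeddings @ embeddings.T / np.sqrt(d)
    return softmax(scores, axis=-1)  # row i holds weights of edges incoming to agent i

def mix_team_value(agent_values, embeddings):
    """One graph-convolution pass over per-agent values, summed into a team value."""
    adj = attention_adjacency(embeddings)
    mixed = adj @ agent_values  # each agent aggregates its neighbours' values
    return float(mixed.sum())   # team state-action value estimate

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))  # 4 agents, 8-dim observation embeddings
q = rng.normal(size=4)         # per-agent observation-action values
team_q = mix_team_value(q, emb)
```

The softmax makes each row of the adjacency a probability distribution over incoming edges, which is what "edge weights governed by an attention mechanism" amounts to in the simplest case.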


Multi-agent Reinforcement Learning for Dynamic Resource Management in 6G in-X Subnetworks
An effective radio resource management method based on multi-agent deep reinforcement learning (MARL) is proposed, which needs only the sum of received power on each channel, the received signal strength indicator (RSSI), rather than individual channel gains, and which outperforms both traditional and MARL-based methods in various aspects.
Cooperative Multi-Agent Reinforcement Learning with Hypergraph Convolution
This paper proposes HyperGraph CoNvolution MIX (HGCN-MIX), a method that combines hypergraph convolution with value decomposition to enhance coordination among different agents in a multi-agent system (MAS).
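Hypergraph convolution generalizes graph convolution by letting a single hyperedge connect any subset of agents. A minimal sketch of one convolution step, using the common symmetric normalization D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2}; this is an illustration of the operation, not HGCN-MIX's exact architecture:

```python
import numpy as np

def hypergraph_conv(X, H):
    """One hypergraph-convolution step.
    X: (n_agents, d) agent features; H: (n_agents, n_hyperedges) incidence matrix."""
    dv = H.sum(axis=1)                     # vertex degrees
    de = H.sum(axis=0)                     # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    De_inv = np.diag(1.0 / de)
    return Dv_inv_sqrt @ H @ De_inv @ H.T @ Dv_inv_sqrt @ X

# 3 agents, 2 hyperedges: {agent 0, agent 1} and {agent 0, agent 1, agent 2}
H = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [0.0, 1.0]])
X = np.eye(3)                              # one-hot agent features
out = hypergraph_conv(X, H)
```

Features propagate from agents to the hyperedges they belong to and back, so agents sharing a hyperedge exchange information in one step.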
Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning
This work proposes a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm and provides a direct reward decomposition method for finding these local rewards when only a global signal is provided.
Multi-agent autonomous battle management using deep neuroevolution
The deep neuroevolution approach outperforms human-programmed AI opponents with a win rate greater than 80% in multi-agent Beyond Visual Range air engagement simulations developed using AFSIM.
Neuro-DCF: Design of Wireless MAC via Multi-Agent Reinforcement Learning Approach
An algorithm is proposed that adopts an experience-driven approach and trains a CSMA-based wireless MAC using deep reinforcement learning; it significantly outperforms 802.11 DCF and O-DCF, a recent theory-based MAC protocol, especially in improving delay performance while preserving optimal utility.
Value Function Factorisation with Hypergraph Convolution for Cooperative Multi-agent Reinforcement Learning
Experimental results show that HGCN-MIX matches or surpasses state-of-the-art techniques on the StarCraft II multi-agent challenge (SMAC) benchmark in various scenarios, notably those with a large number of agents.


Value-Decomposition Networks For Cooperative Multi-Agent Learning
This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
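The additive decomposition behind value-decomposition networks can be shown in a few lines: because the team value is the sum of per-agent values, each agent choosing its own greedy action also maximizes the team value. The function name below is illustrative:

```python
import numpy as np
from itertools import product

def vdn_team_value(per_agent_q, actions):
    """per_agent_q: (n_agents, n_actions); actions: one chosen action index per agent."""
    return sum(float(per_agent_q[i, a]) for i, a in enumerate(actions))

q = np.array([[1.0, 3.0],
              [2.0, 0.5]])
greedy = q.argmax(axis=1)           # each agent maximizes its own value: [1, 0]
team_q = vdn_team_value(q, greedy)  # 3.0 + 2.0 = 5.0
```

Decentralized greedy action selection is exactly as good as searching the joint action space here, which is the property that makes the additive factorization attractive.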
PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning
This work proposes a 'permutation invariant critic' (PIC), which yields identical output irrespective of the agent ordering; this enables the model to scale to 30 times more agents and to improve test episode reward by 15% to 50% on the challenging multi-agent particle environment (MPE).
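Permutation invariance can be obtained by applying a shared encoder to every agent and pooling with a symmetric function such as the mean; a toy sketch of that property, not PIC's graph-network architecture:

```python
import numpy as np

def invariant_critic(agent_obs, w):
    """Shared per-agent encoder followed by mean pooling; the output does not
    depend on the order in which agents are listed."""
    feats = np.tanh(agent_obs @ w)       # same weights applied to every agent
    return float(feats.mean(axis=0).sum())

rng = np.random.default_rng(1)
obs = rng.normal(size=(5, 3))            # 5 agents, 3-dim observations
w = rng.normal(size=(3, 4))
v = invariant_critic(obs, w)
v_shuffled = invariant_critic(obs[::-1], w)  # reversed agent order, same value
```

Because both the encoder and the pooling treat all agents identically, reshuffling the agent dimension cannot change the output.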
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
A new factorization method for MARL, QTRAN, is proposed, which is free from the structural constraints of prior methods and takes a new approach: transforming the original joint action-value function into an easily factorizable one with the same optimal actions.
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings, using centrally computed critics that share an attention mechanism selecting relevant information for each agent at every timestep; this enables more effective and scalable learning in complex multi-agent environments compared to recent approaches.
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values conditioned only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.
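The monotonicity constraint can be enforced by making the mixing weights non-negative, e.g. by taking the absolute value of a state-conditioned hypernetwork's output. A simplified one-layer sketch of that mechanism (QMIX itself uses a two-layer mixing network, and the names here are illustrative):

```python
import numpy as np

def qmix_team_value(agent_qs, state, w_hyper, b_hyper):
    """Mix per-agent values with state-conditioned, non-negative weights.
    Non-negative w guarantees dQ_tot/dQ_i >= 0, i.e. monotonic mixing."""
    w = np.abs(w_hyper @ state)   # hypernetwork output, forced non-negative
    b = float(b_hyper @ state)
    return float(w @ agent_qs) + b

rng = np.random.default_rng(2)
n_agents, state_dim = 3, 6
w_hyper = rng.normal(size=(n_agents, state_dim))
b_hyper = rng.normal(size=state_dim)
state = rng.normal(size=state_dim)
qs = rng.normal(size=n_agents)
q_tot = qmix_team_value(qs, state, w_hyper, b_hyper)
```

Monotonicity is what keeps per-agent greedy maximisation consistent with maximising the mixed team value: raising any single agent's value can never lower the team value.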
The StarCraft Multi-Agent Challenge
The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL learning framework including state-of-the-art algorithms is released.
Comparative Evaluation of Multi-Agent Deep Reinforcement Learning Algorithms
This work evaluates and compares three different classes of MARL algorithms in a diverse range of multi-agent learning tasks and shows that algorithm performance depends strongly on environment properties and no algorithm learns efficiently across all learning tasks.
Relational Deep Reinforcement Learning
We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning.
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
By embracing deep neural networks, this work is able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability.