Corpus ID: 222291715

Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning

@article{Naderializadeh2020GraphCV,
  title={Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning},
  author={Navid Naderializadeh and Fan Hung and Sean Soleyman and Deepak Khosla},
  journal={ArXiv},
  year={2020},
  volume={abs/2010.04740}
}
We propose a novel framework for value function factorization in multi-agent deep reinforcement learning using graph neural networks (GNNs). In particular, we consider the team of agents as the set of nodes of a complete directed graph, whose edge weights are governed by an attention mechanism. Building upon this underlying graph, we introduce a mixing GNN module, which is responsible for two tasks: i) factorizing the team state-action value function into individual per-agent observation-action… 
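
The graph construction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the scaled dot-product form of the attention, the single mixing layer, and the use of absolute-valued weights (a QMIX-style way to keep the team value monotonic in each agent's value) are all assumptions made for illustration.

```python
import numpy as np

def attention_adjacency(features, w_query, w_key):
    """Edge weights of a complete directed graph over agents (self-loops included).

    Hypothetical helper: scaled dot-product attention followed by a row-wise softmax.
    """
    queries = features @ w_query                      # (n_agents, attn_dim)
    keys = features @ w_key                           # (n_agents, attn_dim)
    scores = queries @ keys.T / np.sqrt(queries.shape[1])
    scores -= scores.max(axis=1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def mix_team_value(agent_q, adjacency, w_hidden, w_out):
    """One graph-convolution step over per-agent values, then a sum readout.

    Assumption: absolute-valued weights keep the output non-decreasing
    in every entry of agent_q.
    """
    hidden = np.maximum(adjacency @ agent_q[:, None] @ np.abs(w_hidden), 0.0)
    return float(hidden.sum(axis=0) @ np.abs(w_out))

rng = np.random.default_rng(0)
n_agents, feat_dim, attn_dim, hidden_dim = 4, 8, 5, 6

features = rng.normal(size=(n_agents, feat_dim))      # per-agent embeddings
adjacency = attention_adjacency(
    features,
    rng.normal(size=(feat_dim, attn_dim)),
    rng.normal(size=(feat_dim, attn_dim)),
)
agent_q = rng.normal(size=n_agents)                   # per-agent observation-action values
w_hidden = rng.normal(size=(1, hidden_dim))
w_out = rng.normal(size=hidden_dim)

team_q = mix_team_value(agent_q, adjacency, w_hidden, w_out)
print(adjacency.round(3))
print(team_q)
```

Each row of `adjacency` sums to one, so every agent aggregates a convex combination of all agents' values before the readout combines them into a single team value.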

QGNN: Value Function Factorisation with Graph Neural Networks
TLDR
The results show that QGNN outperforms state-of-the-art value factorisation baselines consistently, and introduces a permutation invariant mixer which is able to match the performance of other methods, even with significantly fewer parameters.
Multi-agent Reinforcement Learning for Dynamic Resource Management in 6G in-X Subnetworks
TLDR
A novel effective intelligent radio resource management method using multi-agent deep reinforcement learning (MARL), which only needs the sum of received power, named received signal strength indicator (RSSI), on each channel instead of channel gains, and which outperforms both traditional and MARL-based methods in various aspects.
Value Function Factorisation with Hypergraph Convolution for Cooperative Multi-agent Reinforcement Learning
TLDR
Experimental results show that HGCN-MIX matches or surpasses state-of-the-art techniques in the StarCraft II multi-agent challenge (SMAC) benchmark on various situations, notably those with a number of agents.
Cooperative Multi-Agent Reinforcement Learning with Hypergraph Convolution
TLDR
This paper proposes HyperGraph CoNvolution MIX (HGCN-MIX), a method that incorporates hypergraph convolution with value decomposition with the aim of enhancing the coordination among different agents in MAS.
Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning
TLDR
This work proposes a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm and provides a direct reward decomposition method for finding these local rewards when only a global signal is provided.
Neuro-DCF: Design of Wireless MAC via Multi-Agent Reinforcement Learning Approach
TLDR
An algorithm that adopts an experience-driven approach and trains a CSMA-based wireless MAC using deep reinforcement learning is proposed; it significantly outperforms 802.11 DCF and O-DCF, a recent theory-based MAC protocol, especially in terms of improving delay performance while preserving optimal utility.
Multi-agent autonomous battle management using deep neuroevolution
TLDR
The deep neuroevolution approach outperforms human-programmed AI opponents with a win rate greater than 80% in multi-agent Beyond Visual Range air engagement simulations developed using AFSIM.

References

SHOWING 1-10 OF 40 REFERENCES
Value-Decomposition Networks For Cooperative Multi-Agent Learning
TLDR
This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
Deep Implicit Coordination Graphs for Multi-agent Reinforcement Learning
TLDR
It is demonstrated that DICG solves the relative overgeneralization pathology in predator-prey tasks and outperforms various MARL baselines on the challenging StarCraft II Multi-agent Challenge (SMAC) and traffic junction environments.
Graph Convolutional Reinforcement Learning
TLDR
Graph convolutional reinforcement learning is proposed, where graph convolution adapts to the dynamics of the underlying graph of the multi-agent environment, and relation kernels capture the interplay between agents by their relation representations.
PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning
TLDR
This work proposes a 'permutation invariant critic' (PIC), which yields identical output irrespective of the agent permutation, which enables the model to scale to 30 times more agents and to achieve improvements of test episode reward between 15% to 50% on the challenging multi-agent particle environment (MPE).
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
TLDR
A new factorization method for MARL, QTRAN, is proposed, which is free from such structural constraints and takes on a new approach to transforming the original joint action-value function into an easily factorizable one, with the same optimal actions.
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
TLDR
An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
TLDR
This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings, using centrally computed critics that share an attention mechanism which selects relevant information for each agent at every timestep, which enables more effective and scalable learning in complex multi-agent environments, when compared to recent approaches.
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
TLDR
QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.
The StarCraft Multi-Agent Challenge
TLDR
The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL framework including state-of-the-art algorithms is released.
Comparative Evaluation of Multi-Agent Deep Reinforcement Learning Algorithms
TLDR
This work evaluates and compares three different classes of MARL algorithms in a diverse range of multi-agent learning tasks and shows that algorithm performance depends strongly on environment properties and no algorithm learns efficiently across all learning tasks.