Coding for Distributed Multi-Agent Reinforcement Learning

Baoqian Wang, Junfei Xie, Nikolay A. Atanasov. 2021 IEEE International Conference on Robotics and Automation (ICRA).

This paper aims to mitigate straggler effects in synchronous distributed learning for multi-agent reinforcement learning (MARL) problems. Stragglers arise frequently in a distributed learning system, due to the existence of various system disturbances such as slow-downs or failures of compute nodes and communication bottlenecks. To resolve this issue, we propose a coded distributed learning framework, which speeds up the training of MARL algorithms in the presence of stragglers, while… 
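
As a rough illustration of how coding can tolerate stragglers in a synchronous setting, the sketch below has each learner return a linear combination of per-agent updates, and lets a central controller decode all updates from whichever learners finish first. It is a generic coded-computation example, not the scheme from the paper: the Vandermonde encoding matrix `C`, the helper names `encode` and `decode`, and the random vectors standing in for agent updates are all assumptions made for illustration.

```python
import numpy as np


def encode(C, theta):
    """Simulate the learners: learner j returns sum_i C[j, i] * theta[i]."""
    return C @ theta


def decode(C, coded, finished):
    """Recover every agent's update from the learners that finished in time."""
    C_sub = C[finished]
    # Decoding is possible only if the finished rows span all agents.
    if np.linalg.matrix_rank(C_sub) < C.shape[1]:
        raise ValueError("not enough finished learners to decode")
    theta_hat, *_ = np.linalg.lstsq(C_sub, coded[finished], rcond=None)
    return theta_hat


rng = np.random.default_rng(0)
num_agents, num_learners, dim = 3, 5, 4

# Hypothetical encoding matrix: Vandermonde rows with distinct nodes, so any
# num_agents rows are linearly independent (an MDS-like property).
nodes = np.arange(1, num_learners + 1, dtype=float)
C = np.vander(nodes, num_agents, increasing=True)

# Stand-in for the per-agent parameter updates (random, for illustration only).
theta = rng.normal(size=(num_agents, dim))
coded = encode(C, theta)

# Learners 1 and 3 straggle; the controller decodes from the other three.
finished = [0, 2, 4]
recovered = decode(C, coded, finished)
```

With this choice of `C`, the controller can decode as soon as any three of the five learners respond, so up to two stragglers do not delay the update. The price is redundancy: each learner's result depends on several agents' updates.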

The Holy Grail of Multi-Robot Planning: Learning to Generate Online-Scalable Solutions from Offline-Optimal Experts

This blue-sky paper elaborates some of the key challenges that remain in applying learning-based methods to multi-robot planning. It hopes that a policy trained to copy an optimal pattern generated by a small-scale system can be transferred to much larger, decentralized systems while maintaining near-optimal performance.

Learning to Communicate with Deep Multi-Agent Reinforcement Learning

By embracing deep neural networks, this work is able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability.

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Two methods enable the successful combination of experience replay with multi-agent RL: a multi-agent variant of importance sampling that naturally decays obsolete data, and conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory.

Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems

This paper identifies several challenges responsible for the non-coordination of independent agents: Pareto-selection, non-stationarity, stochasticity, alter-exploration and shadowed equilibria, and can serve as a basis for choosing the appropriate algorithm for a new domain.

Mean Field Multi-Agent Reinforcement Learning

Existing multi-agent reinforcement learning methods are typically limited to a small number of agents. When the number of agents grows large, learning becomes intractable due to the curse of

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.

Integrating On-policy Reinforcement Learning with Multi-agent Techniques for Adaptive Service Composition

A new model for large-scale and adaptive service composition is presented, built on multi-agent reinforcement learning that combines on-policy reinforcement learning with game theory. It is expected to outperform single-agent reinforcement learning methods in the composition framework.

Massively Parallel Methods for Deep Reinforcement Learning

This work presents the first massively distributed architecture for deep reinforcement learning, using a distributed neural network to represent the value function or behaviour policy, and a distributed store of experience to implement the Deep Q-Network algorithm.

Extending Q-Learning to General Adaptive Multi-Agent Systems

This paper proposes a fundamentally different approach to Q-Learning, dubbed Hyper-Q, in which values of mixed strategies rather than base actions are learned, and in which other agents' strategies are estimated from observed actions via Bayesian inference.

Continuous control with deep reinforcement learning

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

Asynchronous Methods for Deep Reinforcement Learning

This work proposes a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent to optimize deep neural network controllers. It shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems, as well as on a new task of navigating random 3D mazes from visual input.