Corpus ID: 237364152

WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

Tian Lan*, Sunil Srinivasa*, Huan Wang, Stephan Zheng

ABSTRACT: Deep reinforcement learning (RL) is a powerful framework for training decision-making models in complex environments. However, RL can be slow because it requires repeated interaction with a simulation of the environment. In particular, key systems-engineering bottlenecks arise when using RL in complex environments that feature multiple agents with high-dimensional state, observation, or action spaces. We present WarpDrive, a…
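The bottleneck described above is commonly avoided by batching many environment copies into array operations so that rollouts never leave the accelerator. The sketch below illustrates that batched-simulation pattern with NumPy standing in for GPU tensors; the `BatchedEnv` class and its dynamics are illustrative assumptions, not WarpDrive's actual API.

```python
import numpy as np

class BatchedEnv:
    """Toy vectorized multi-agent environment: all envs and agents step in
    single array operations, with no per-environment Python loop -- the same
    pattern end-to-end GPU RL uses to keep rollout data on-device.
    (Illustrative only; not WarpDrive's API.)"""

    def __init__(self, num_envs, num_agents, seed=0):
        self.num_envs = num_envs
        self.num_agents = num_agents
        self.rng = np.random.default_rng(seed)
        self.state = np.zeros((num_envs, num_agents))

    def reset(self):
        self.state[:] = 0.0
        return self.state.copy()

    def step(self, actions):
        # One vectorized update covering every env x agent pair.
        self.state += actions
        rewards = -np.abs(self.state)             # reward peaks at state == 0
        done = np.abs(self.state).max(axis=1) > 5.0
        return self.state.copy(), rewards, done

env = BatchedEnv(num_envs=1024, num_agents=4)
obs = env.reset()
actions = env.rng.choice([-1.0, 1.0], size=obs.shape)
obs, rewards, done = env.step(actions)
```

Swapping the NumPy arrays for on-device tensors (CUDA kernels, JAX, or PyTorch) gives the throughput gains the papers below report, since both simulation and policy inference then share the same memory.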


SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
SEED (Scalable, Efficient Deep-RL) is a modern scalable reinforcement-learning agent that can train on millions of frames per second and lowers the cost of experiments compared to current methods, all with a simple architecture.
Megaverse: Simulating Embodied Agents at One Million Experiences per Second
The efficient design of the engine enables physics-based simulation with high-dimensional egocentric observations at more than 1,000,000 actions per second on a single 8-GPU node, thereby taking full advantage of the massive parallelism of modern GPUs.
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
A new distributed agent, IMPALA (Importance Weighted Actor-Learner Architecture), not only uses resources more efficiently in single-machine training but also scales to thousands of machines without sacrificing data efficiency or resource utilisation.
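IMPALA's importance weighting refers to its V-trace off-policy correction: value targets are built from behaviour-policy trajectories using clipped importance ratios. A minimal sketch for one trajectory, assuming standard V-trace with clipping constants rho_bar and c_bar:

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap, rhos,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """V-trace value targets for a single T-step trajectory.
    rewards, values, rhos: shape (T,) arrays; bootstrap is V(x_T);
    rhos are importance ratios pi(a|x) / mu(a|x)."""
    T = len(rewards)
    clipped_rho = np.minimum(rho_bar, rhos)
    clipped_c = np.minimum(c_bar, rhos)
    values_next = np.append(values[1:], bootstrap)
    # Temporal-difference errors weighted by clipped ratios.
    deltas = clipped_rho * (rewards + gamma * values_next - values)
    vs_minus_v = np.zeros(T + 1)
    # Backward recursion:
    #   v_s - V(x_s) = delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1}))
    for t in reversed(range(T)):
        vs_minus_v[t] = deltas[t] + gamma * clipped_c[t] * vs_minus_v[t + 1]
    return values + vs_minus_v[:-1]
```

When the data is on-policy (all ratios equal 1), the targets reduce to ordinary n-step returns, which is a quick sanity check for the recursion.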
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
ELF, an Extensive, Lightweight and Flexible platform for fundamental reinforcement-learning research, is proposed; a network with Leaky ReLU and Batch Normalization, coupled with long-horizon training and a progressive curriculum, beats the rule-based built-in AI more than 70% of the time in the full game of Mini-RTS.
Acme: A Research Framework for Distributed Reinforcement Learning
It is shown that the design decisions behind Acme lead to agents that can be scaled both up and down, and that, for the most part, greater levels of parallelization yield agents with equivalent performance, just faster.
Mava: a research framework for distributed multi-agent reinforcement learning
Mava is presented: a research framework specifically designed for building scalable MARL systems. It provides useful components, abstractions, utilities, and tools for MARL, and allows simple scaling for multi-process system training and execution while offering a high level of flexibility and composability.
Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates
It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks, and can learn deep neural network policies efficiently enough to train on real physical robots.
Grandmaster level in StarCraft II using multi-agent reinforcement learning
The agent, AlphaStar, uses a multi-agent reinforcement learning algorithm and has reached Grandmaster level, ranking among the top 0.2% of human players in the real-time strategy game StarCraft II.
Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning
This research presents Isaac Gym, a high-performance learning platform that trains policies for a wide variety of robotics tasks directly on GPU, leading to blazing-fast training times for complex robotics tasks on a single GPU, with 2-3 orders-of-magnitude improvements over conventional RL training.
Brax - A Differentiable Physics Engine for Large Scale Rigid Body Simulation
We present Brax, an open-source library for rigid-body simulation with a focus on performance and parallelism on accelerators, written in JAX. We present results on a suite of tasks inspired by the…