Corpus ID: 240288385

RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem

Eric Liang, Zhanghao Wu, Michael Luo, Sven Mika, Joseph E. Gonzalez, Ion Stoica
Researchers and practitioners in the field of reinforcement learning (RL) frequently leverage parallel computation, which has led to a plethora of new algorithms and systems in the last few years. In this paper, we re-examine the challenges posed by distributed RL and try to view it through the lens of an old idea: distributed dataflow. We show that viewing RL as a dataflow problem leads to highly composable and performant implementations. We propose RLlib Flow, a hybrid actor-dataflow… 
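The dataflow framing can be made concrete with a minimal pure-Python sketch: experience "flows" through composable iterator operators into a training step. The names and structure below are illustrative only, not RLlib Flow's actual API.

```python
# Illustrative sketch of RL training as a dataflow of composable
# iterator operators (hypothetical names, not RLlib Flow's real API).

def rollouts(num_steps):
    """Source operator: emit batches of experience."""
    for step in range(num_steps):
        yield {"step": step, "reward": 0.1 * step}

def train_on(batches, weights):
    """Transform operator: consume batches, emit updated model weights."""
    for batch in batches:
        weights["sum_reward"] = weights.get("sum_reward", 0.0) + batch["reward"]
        yield dict(weights)

# Composing the two operators yields the full training pipeline.
weights = {}
for snapshot in train_on(rollouts(5), weights):
    pass  # a real distributed pipeline would broadcast snapshots to workers

print(weights)
```

Because each stage is just an iterator transformation, stages can be swapped, fused, or distributed independently, which is the composability the paper argues for.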
Deep Reinforcement Learning: Opportunities and Challenges
In this article, a brief introduction to reinforcement learning (RL) and its relationship with deep learning, machine learning, and AI is given, and a discussion attempts to answer: “Why has RL not been widely adopted in practice yet?” and “When is RL helpful?”.
Reinforcement Learning in Practice: Opportunities and Challenges
This article is a gentle discussion of the field of reinforcement learning in practice, covering opportunities and challenges across a broad range of topics, offering perspectives without technical detail.
Distributed Training of Knowledge Graph Embedding Models using Ray
This work uses Ray to build an end-to-end system for data preprocessing and distributed training of graph-neural-network-based knowledge graph embedding models, and applies it to the link prediction task, i.e., using knowledge graph embeddings to discover links between nodes in graphs.
Acme: A Research Framework for Distributed Reinforcement Learning
It is shown that the design decisions behind Acme lead to agents that can be scaled both up and down and that, for the most part, greater levels of parallelization result in agents with equivalent performance, just faster.
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
A new distributed agent IMPALA (Importance Weighted Actor-Learner Architecture) is developed that not only uses resources more efficiently in single-machine training but also scales to thousands of machines without sacrificing data efficiency or resource utilisation.
RLlib: Abstractions for Distributed Reinforcement Learning
This work argues for distributing RL components in a composable way by adapting algorithms for top-down hierarchical control, thereby encapsulating parallelism and resource requirements within short-running compute tasks. This is realized in RLlib, a library that provides scalable software primitives for RL.
IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks
A new distributed reinforcement learning algorithm, IMPACT, which extends PPO with three changes: a target network for stabilizing the surrogate objective, a circular buffer, and truncated importance sampling, and trains faster than existing scalable agents while preserving the sample efficiency of synchronous PPO.
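One of IMPACT's three changes, truncated importance sampling, clips the importance ratio π(a|s)/μ(a|s) at a constant to bound the variance of off-policy updates. A minimal sketch (the clipping constant `c_bar` and the function name are illustrative, not the paper's code):

```python
def truncated_is_weight(pi_prob, mu_prob, c_bar=1.0):
    """Clip the importance ratio pi/mu at c_bar to bound variance,
    at the cost of some bias toward the behavior policy mu."""
    return min(c_bar, pi_prob / mu_prob)

# Ratios above c_bar are truncated; ratios below pass through unchanged.
print(truncated_is_weight(0.5, 0.25))   # raw ratio 2.0 is clipped to 1.0
print(truncated_is_weight(0.1, 0.25))   # raw ratio 0.4 is kept
```

The same truncation idea underlies IMPALA's V-trace correction, which is why the two architectures are closely related.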
RLgraph: Modular Computation Graphs for Deep Reinforcement Learning
RLgraph is introduced, a library for designing and executing reinforcement learning tasks in both static graph and define-by-run paradigms, and its implementations are robust, incrementally testable, and yield high performance across different deep learning frameworks and distributed backends.
SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning
The design choices behind SLM Lab are presented and it is used to produce a comprehensive single-codebase RL algorithm benchmark and a discrete-action variant of the Soft Actor-Critic algorithm and a hybrid synchronous/asynchronous training method for RL agents are introduced.
Soft Actor-Critic Algorithms and Applications
Soft Actor-Critic (SAC), the recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework, achieves state-of-the-art performance, outperforming prior on-policy and off-policy methods in sample efficiency and asymptotic performance.

Distributed Prioritized Experience Replay
This work proposes a distributed architecture for deep reinforcement learning at scale, that enables agents to learn effectively from orders of magnitude more data than previously possible, and substantially improves the state of the art on the Arcade Learning Environment.
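The core mechanism, sampling stored transitions with probability proportional to a priority (typically the magnitude of the TD error), can be sketched in a few lines. This toy buffer is illustrative only; it omits the sum-tree data structure and importance-sampling corrections a real implementation uses.

```python
import random

class PrioritizedReplayBuffer:
    """Toy prioritized replay: sample item i with probability p_i / sum(p)."""
    def __init__(self):
        self.items, self.priorities = [], []

    def add(self, item, priority):
        self.items.append(item)
        self.priorities.append(priority)

    def sample(self):
        r = random.uniform(0.0, sum(self.priorities))
        acc = 0.0
        for item, p in zip(self.items, self.priorities):
            acc += p
            if r <= acc:
                return item
        return self.items[-1]  # guard against floating-point edge cases

random.seed(0)
buf = PrioritizedReplayBuffer()
buf.add("low_error", 0.1)    # rarely worth replaying
buf.add("high_error", 10.0)  # high TD error, replayed often
counts = {"low_error": 0, "high_error": 0}
for _ in range(1000):
    counts[buf.sample()] += 1
print(counts)  # high-priority transitions dominate the samples
```

In the distributed (Ape-X-style) setting, many actor processes feed such a shared buffer while a single learner samples from it, which is what lets the agent learn from orders of magnitude more data.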
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames
It is shown that the scene understanding and navigation policies learned can be transferred to other navigation tasks -- the analog of "ImageNet pre-training + task-specific fine-tuning" for embodied AI.
Ray: A Distributed Framework for Emerging AI Applications
This paper proposes an architecture that logically centralizes the system's control state using a sharded storage system and a novel bottom-up distributed scheduler that speeds up challenging benchmarks and serves as both a natural and performant fit for an emerging class of reinforcement learning applications and algorithms.