Corpus ID: 226226572

Deep Reactive Planning in Dynamic Environments

Keita Ota, Devesh K. Jha, Tadashi Onishi, Asako Kanezaki, Yusuke Yoshiyasu, Yoko Sasaki, Toshisada Mariyama, Daniel Nikovski
The main novelty of the proposed approach is that it allows a robot to learn an end-to-end policy which can adapt to changes in the environment during execution. While goal conditioning of policies has been studied in the RL literature, such approaches are not easily extended to cases where the robot's goal can change during execution. This is something that humans are naturally able to do. However, it is difficult for robots to learn such reflexes (i.e., to naturally respond to dynamic… 
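The key idea of goal conditioning is that the goal is part of the policy's input, so changing the goal mid-execution immediately changes the action without retraining. A minimal sketch of this, using a toy fixed-weight network (illustrative only; the names `goal_conditioned_policy`, `W1`, and `W2` are assumptions, not the paper's architecture):

```python
import numpy as np

def goal_conditioned_policy(state, goal, weights):
    """Toy goal-conditioned policy: state and goal are concatenated into
    one input vector, so swapping the goal during a rollout changes the
    action on the very next step without any retraining."""
    x = np.concatenate([state, goal])
    h = np.tanh(weights["W1"] @ x)      # hidden layer
    return np.tanh(weights["W2"] @ h)   # action in [-1, 1]^action_dim

# Random fixed weights stand in for a trained network.
rng = np.random.default_rng(0)
weights = {"W1": rng.normal(size=(16, 4)), "W2": rng.normal(size=(2, 16))}

state = np.array([0.0, 0.0])
a1 = goal_conditioned_policy(state, np.array([1.0, 0.0]), weights)
a2 = goal_conditioned_policy(state, np.array([0.0, 1.0]), weights)  # goal changed mid-rollout
```

Here the same policy produces a different action for the same state once the goal input changes, which is the mechanism the abstract contrasts with policies whose goal is fixed at training time.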


MPC-MPNet: Model-Predictive Motion Planning Networks for Fast, Near-Optimal Planning Under Kinodynamic Constraints
This work presents a scalable, imitation-learning-based Model-Predictive Motion Planning Networks framework that quickly finds near-optimal path solutions with worst-case theoretical guarantees under kinodynamic constraints for practical underactuated systems.


Efficient Exploration in Constrained Environments with Goal-Oriented Reference Path
A deep convolutional network is trained to predict collision-free paths from a map of the environment; this path is used by a reinforcement learning algorithm that learns to follow it closely, which allows the trained agent to generalize well while learning faster.

Learning Navigation Behaviors End-to-End With AutoRL
Empirical evaluations show that AutoRL policies do not suffer from the catastrophic forgetting that plagues many other deep reinforcement learning algorithms, generalize to new environments and moving obstacles, are robust to sensor, actuator, and localization noise, and can serve as robust building blocks for larger navigation tasks.
Trajectory Optimization for Unknown Constrained Systems using Reinforcement Learning
A reinforcement learning-based algorithm for trajectory optimization of constrained dynamical systems, trained with a reference path and with policies parameterized by goal locations, so that the agent can be trained for multiple goals simultaneously.
End-to-End Training of Deep Visuomotor Policies
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
RL-CycleGAN: Reinforcement Learning Aware Simulation-to-Real
The RL-CycleGAN, a new approach for simulation-to-real-world transfer for reinforcement learning, is obtained by incorporating the RL-scene consistency loss into unsupervised domain translation, which ensures that the translation operation is invariant with respect to the Q-values associated with the image.
Planning with Goal-Conditioned Policies
This work shows that goal-conditioned policies learned with RL can be incorporated into planning, such that a planner can focus on which states to reach, rather than how those states are reached, and proposes using a latent variable model to compactly represent the set of valid states.
PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-Based Planning
This work presents PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL), and evaluates it on two navigation tasks with non-trivial robot dynamics.
Real-Time Perception Meets Reactive Motion Generation
This work extensively evaluates the systems on a real robotic platform in four scenarios that exhibit either a challenging workspace geometry or a dynamic environment and quantifies the robustness and accuracy that is due to integrating real-time feedback at different time scales in a reactive motion generation system.
Learning Robotic Assembly from CAD
This work exploits the fact that in modern assembly domains, geometric information about the task is readily available via the CAD design files, and proposes a neural network architecture that can learn to track the motion plan, thereby generalizing the assembly controller to changes in the object positions.
Safe Reinforcement Learning With Model Uncertainty Estimates
MC-Dropout and Bootstrapping are used to give computationally tractable and parallelizable uncertainty estimates and are embedded in a Safe Reinforcement Learning framework to form uncertainty-aware navigation around pedestrians, resulting in a collision avoidance policy that knows what it does not know and cautiously avoids pedestrians that exhibit unseen behavior.