On Training Flexible Robots using Deep Reinforcement Learning

@inproceedings{dwiel2019training,
  title={On Training Flexible Robots using Deep Reinforcement Learning},
  author={Zach Dwiel and Madhavun Candadai and Mariano Phielipp},
  booktitle={2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2019},
}
The use of robotics in controlled environments has flourished over the last several decades, and training robots to perform tasks using control strategies developed from dynamical models of their hardware has proven very effective. However, in many real-world settings, the uncertainties of the environment, safety requirements, and the generalized capabilities expected of robots make rigid industrial robots unsuitable. This has created great research interest in developing control strategies…
Training an Under-actuated Gripper for Grasping Shallow Objects Using Reinforcement Learning
This paper presents an approach for training an under-actuated gripper, and the robot it is attached to, to grasp shallow objects, and shows a reduction in both grasping time and the number of movements.


Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates
It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.
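At the core of such off-policy training is the Q-learning update, which bootstraps from the best next action rather than the action the behaviour policy actually takes. The paper scales this idea to deep Q-functions on real robots; the sketch below only illustrates the update itself, in tabular form on a hypothetical toy chain MDP, and is not the paper's method.

```python
import random

# Toy chain MDP: states 0..4, actions 0 (left) / 1 (right).
# Reaching state 4 yields reward 1 and ends the episode.
N_STATES, GOAL = 5, 4

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    done = s2 == GOAL
    return s2, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy.
            a = rng.randrange(2) if rng.random() < eps else max((0, 1), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            # Off-policy target: max over next actions, regardless of the
            # action the behaviour policy will actually take next.
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

After training, the greedy policy moves right toward the goal from every state; the deep variant replaces the table with a neural network and runs many such updates asynchronously.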
Automated Deep Reinforcement Learning Environment for Hardware of a Modular Legged Robot
An automated learning environment for developing control policies directly on the hardware of a modular legged robot facilitates the reinforcement learning process by computing the rewards using a vision-based tracking system and relocating the robot to the initial position using a resetting mechanism.
Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning
It is demonstrated how a low-cost, off-the-shelf robotic system can learn closed-loop policies for a stacking task in only a handful of trials, from scratch.
Learning force control policies for compliant manipulation
This work presents an approach to acquiring manipulation skills on compliant robots through reinforcement learning, and uses the Policy Improvement with Path Integrals (PI2) algorithm to learn these force/torque profiles by optimizing a cost function that measures task success.
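The PI2 update is simple at its core: perturb the policy parameters, roll out each perturbation, and average the noise weighted by an exponential of the (negated, scaled) cost. The sketch below is a toy black-box version with a hypothetical quadratic cost standing in for a real force/torque-profile cost; it is not the paper's implementation.

```python
import math
import random

def pi2_step(theta, cost, n_rollouts=20, sigma=0.5, lam=0.1, rng=None):
    """One PI2-style update: perturb theta, weight the noise by exp(-cost)."""
    rng = rng or random.Random(0)
    eps = [rng.gauss(0.0, sigma) for _ in range(n_rollouts)]
    costs = [cost(theta + e) for e in eps]
    c_min, c_max = min(costs), max(costs)
    span = max(c_max - c_min, 1e-12)
    # Normalised exponential weights: low-cost rollouts dominate the update.
    w = [math.exp(-(c - c_min) / (lam * span)) for c in costs]
    z = sum(w)
    return theta + sum(wi * ei for wi, ei in zip(w, eps)) / z

def optimize(cost, theta0=0.0, iters=100):
    theta, rng = theta0, random.Random(1)
    for _ in range(iters):
        theta = pi2_step(theta, cost, rng=rng)
    return theta

# Toy stand-in for a task cost, e.g. deviation from a desired force profile.
cost = lambda th: (th - 3.0) ** 2
```

Because the update uses only sampled costs, no gradient of the cost function is ever needed, which is what makes the approach attractive for contact-rich manipulation.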
Reinforcement learning in robotics: A survey
This article attempts to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots, highlighting both key challenges in robot reinforcement learning and notable successes.
Continuous control with deep reinforcement learning
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
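The deterministic policy gradient underlying this algorithm updates the policy by chaining the critic's action gradient through the policy's parameter gradient. The toy sketch below, under assumptions not in the original, uses a linear policy and a known differentiable reward in place of a learned critic; the full algorithm (DDPG) adds neural networks, a replay buffer, and target networks.

```python
import random

def train(iters=200, lr=0.05, seed=0):
    """Toy deterministic policy gradient: policy a = w*s, Q(s,a) = -(a - 2s)^2."""
    rng = random.Random(seed)
    w = 0.0  # policy parameter; the optimal policy is a = 2s
    for _ in range(iters):
        s = rng.uniform(-1.0, 1.0)  # sampled state
        a = w * s                   # deterministic action
        # Chain rule of the deterministic policy gradient:
        # dQ/dw = dQ/da * da/dw = -2(a - 2s) * s
        grad_w = -2.0 * (a - 2.0 * s) * s
        w += lr * grad_w            # ascend the action-value
    return w
```

Gradient ascent drives w toward 2, the optimal linear gain; swapping the known reward for a learned critic and the linear maps for deep networks recovers the actor-critic structure the paper describes.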
Skillful control under uncertainty via direct reinforcement learning
It is argued that, for learning tasks arising frequently in control applications, the most useful methods in practice are probably those the authors call direct associative reinforcement learning methods; these methods are described, and their utility for learning skilled robot control under uncertainty is illustrated with an example.
End-to-End Training of Deep Visuomotor Policies
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
A Survey on Policy Search for Robotics
This work classifies model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy and presents a unified view on existing algorithms.
Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot
It is demonstrated that an appropriate feedback controller can be acquired within a few thousand trials in numerical simulation, and that the controller so obtained achieves stable walking with a physical robot in the real world.