Reinforcement learning of motor skills with policy gradients

  • Jan Peters, S. Schaal
  • Published 1 May 2008
  • Computer Science
  • Neural Networks: the official journal of the International Neural Network Society, volume 21, issue 4

Reinforcement Learning for Motor Primitives

This diploma thesis implements the framework of motor primitives based on dynamical systems, adapts it to the authors' task, and subsequently discusses how the suggested learning framework works in toy applications.

Reinforcement learning of motor skills using Policy Search and human corrective advice

The results show that the proposed method not only converges to higher rewards when learning movement primitives, but also speeds up learning by a factor of 4 to 40, depending on the task.

Towards Motor Skill Learning for Robotics

This paper proposes to break the generic skill learning problem into parts that the authors can understand well from a robotics point of view, and designs appropriate learning approaches for these basic components, which will serve as the ingredients of a general approach to motor skill learning.

Socially guided intrinsic motivation for robot learning of motor skills

It is illustrated that SGIM-D efficiently combines the advantages of social learning and intrinsic motivation and benefits from human demonstration properties to learn how to produce varied outcomes in the environment, while developing more precise control policies in large spaces.

Reinforcement learning of motor skills in high dimensions: A path integral approach

This paper derives a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals. The authors believe that the resulting algorithm, Policy Improvement with Path Integrals (PI2), is currently one of the most efficient, numerically robust, and easy-to-implement algorithms for RL in robotics.

Learning motor skills: from algorithms to robot experiments

  • J. Kober
  • Computer Science
  • it Inf. Technol.
  • 2012
It is shown how motor primitives can be employed to learn motor skills on three different levels, which contributes to the state of the art in reinforcement learning applied to robotics both in terms of novel algorithms and applications.

Robot Skill Learning

The generic skill learning problem is proposed to be divided into parts that can be well-understood from a robotics point of view, and appropriate learning approaches for these basic components will serve as the ingredients of a general approach to robot skill learning.

Deep Reinforcement Learning for Robotic Manipulation

It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.

Learning Motor Skills - From Algorithms to Robot Experiments

This book illustrates a method that learns to generalize parameterized motor plans, which are obtained by imitation or reinforcement learning, by adapting a small set of global parameters, and presents appropriate kernel-based reinforcement learning algorithms.

Machine Learning for motor skills in robotics

This work investigates the ingredients for a general approach to motor skill learning and studies two major components of such an approach: a theoretically well-founded general approach for representing the control structures required for task representation and execution, and appropriate learning algorithms that can be applied in this setting.

Reinforcement Learning for Humanoid Robotics

This paper discusses different approaches of reinforcement learning in terms of their applicability in humanoid robotics, and demonstrates that ‘vanilla’ policy gradient methods can be significantly improved using the natural policy gradient instead of the regular policy gradient.

Learning Attractor Landscapes for Learning Motor Primitives

By nonlinearly transforming the canonical attractor dynamics using techniques from nonparametric regression, almost arbitrary new nonlinear policies can be generated without losing the stability properties of the canonical system.

Policy Gradient Methods for Robotics

  • Jan Peters, S. Schaal
  • Computer Science
  • 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems
  • 2006
An overview of learning with policy gradient methods for robotics is given, with a strong focus on recent advances in the field, and it is shown how the most recently developed methods can significantly improve learning performance.

Learning by Demonstration

  • S. Schaal
  • Education, Computer Science
  • Encyclopedia of Machine Learning and Data Mining
  • 1996
In an implementation of pole balancing on a complex anthropomorphic robot arm, it is demonstrated that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems.

Learning Movement Primitives

A novel reinforcement learning technique based on natural stochastic policy gradients provides a general approach to improving DMPs by trial-and-error learning with respect to almost arbitrary optimization criteria, and the different ingredients of the DMP approach are demonstrated in various examples.

Reinforcement learning for imitating constrained reaching movements

A system for teaching a robot constrained reaching tasks is described. It is based on a dynamic system generator modulated by a learned speed trajectory, combined with a reinforcement learning module that allows the robot to adapt the trajectory when facing a new situation, e.g., in the presence of obstacles.

Efficient Gradient Estimation for Motor Control Learning

Two techniques are presented for reducing gradient estimation errors in the presence of observable input noise applied to the control signal; they significantly improve the response function gradient estimate and, consequently, the learning curve, over existing methods.

Acquiring robot skills via reinforcement learning

A stochastic real-valued (SRV) reinforcement learning algorithm is described and used for learning control, and the authors show how it can be used with nonlinear multilayer ANNs.

Neuronlike adaptive elements that can solve difficult learning control problems

It is shown how a system consisting of two neuronlike adaptive elements can solve a difficult learning control problem, and the relation of this work to classical and instrumental conditioning in animal learning studies and its possible implications for research in the neurosciences are discussed.