Latent space policy search for robotics

  • Kevin Sebastian Luck, Gerhard Neumann, Erik Berger, Jan Peters, Heni Ben Amor
  • Published 6 November 2014
  • Computer Science
  • 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems
Learning motor skills for robots is a hard task. In particular, a high number of degrees-of-freedom in the robot can pose serious challenges to existing reinforcement learning methods, since it leads to a high-dimensional search space. However, complex robots are often intrinsically redundant systems and, therefore, can be controlled using a latent manifold of much smaller dimensionality. In this paper, we present a novel policy search method that performs efficient reinforcement learning by… 
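The core idea — searching for a policy in a low-dimensional latent space that is decoded into the full, redundant parameter space — can be illustrated with a minimal sketch. This is not the paper's EM-based algorithm: here a fixed random linear map `W` plays the role of the latent-to-full decoder, the quadratic `reward` is a toy stand-in for a rollout return, and `latent_search` is a simple cross-entropy-style search; all of these names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

D, d = 40, 3                          # full parameter dim vs. latent dim
W = rng.standard_normal((D, d))       # fixed linear decoder: latent -> full space
theta_star = rng.standard_normal(D)   # unknown optimum of the toy reward

def reward(theta):
    # Toy stand-in for a rollout return: negative squared distance to target.
    return -float(np.sum((theta - theta_star) ** 2))

def latent_search(iters=200, pop=32, elite=8, sigma=0.5):
    """Cross-entropy-style search in the d-dimensional latent space only."""
    z = np.zeros(d)
    for _ in range(iters):
        # Sample candidate latent vectors around the current mean.
        cand = z + sigma * rng.standard_normal((pop, d))
        # Evaluate each candidate after decoding it into the full space.
        scores = np.array([reward(W @ c) for c in cand])
        # Move the mean toward the elite (highest-reward) candidates.
        z = cand[np.argsort(scores)[-elite:]].mean(axis=0)
        sigma *= 0.97                 # gradually shrink exploration noise
    return z

z_best = latent_search()
```

The search only ever samples 3-dimensional vectors, yet it improves a 40-dimensional policy — the exploitation of redundancy that the abstract describes. The latent optimum is limited to the span of `W`; the paper's contribution is, in part, learning such a projection rather than fixing it in advance.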


Sparse Latent Space Policy Search
A reinforcement learning method for sample-efficient policy search that exploits correlations between control variables, particularly frequent in motor skill learning tasks, and outperforms state-of-the-art policy search methods.
Contextual Policy Search for Micro-Data Robot Motion Learning through Covariate Gaussian Process Latent Variable Models
In the next few years, the amount and variety of context-aware robotic manipulator applications are expected to increase significantly, especially in household environments. In such spaces, thanks to…
Sample-Efficient Robot Motion Learning using Gaussian Process Latent Variable Models
A surrogate model of the reward function is built that maps an MP-parameter latent space (obtained through a mutual-information-weighted Gaussian Process Latent Variable Model) to a reward, while the task dynamics are left unmodeled, making policy improvement faster.
User Feedback in Latent Space Robotic Skill Learning
In order to operate in everyday human environments, humanoid robots will need to autonomously learn and adapt their actions, using, among other methods, reinforcement learning (RL). A common challenge…
Dimensionality Reduction and Prioritized Exploration for Policy Search
A novel method to prioritize the exploration of effective parameters and cope with full covariance matrix updates is presented, which learns faster than recent approaches and requires fewer samples to achieve state-of-the-art results.
Bimanual robot skills: MP encoding, dimensionality reduction and reinforcement learning
This thesis addresses inverse kinematics for redundant robot manipulators, i.e., positioning the robot joints so as to reach a certain end-effector pose; it opts for iterative solutions based on inverting the robot's kinematic Jacobian and proposes filtering and limiting the gains in the spectral domain.
Dimensionality reduction for probabilistic movement primitives
This paper uses probabilistic dimensionality-reduction techniques based on expectation maximization to extract the unknown synergies from a given set of demonstrations, and shows that the dimensionality reduction is more efficient both for encoding a trajectory from data and for applying reinforcement learning with Relative Entropy Policy Search (REPS).
Tensor Based Knowledge Transfer Across Skill Categories for Robot Control
A class of neural-network controllers that can realise four distinct skill classes (reaching, object throwing, casting, and ball-in-cup) is introduced; factorising the weights of the neural network extracts transferrable latent skills that enable dramatic acceleration of learning in cross-task transfer.
Improved Exploration through Latent Trajectory Optimization in Deep Deterministic Policy Gradient
This work evaluates the use of model-based trajectory optimization methods used for exploration in Deep Deterministic Policy Gradient when trained on a latent image embedding, leading to a symbiotic relationship between the deep reinforcement learning algorithm and the latent trajectory optimizer.
Extracting bimanual synergies with reinforcement learning
  • K. Luck, H. B. Amor
  • Computer Science
  • 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2017
It is discussed how synergies can be learned through latent space policy search, and an extension of the algorithm for the re-use of previously learned synergies for exploration is introduced.

Using dimensionality reduction to exploit constraints in reinforcement learning
This work presents an alternative way to incorporate prior knowledge from demonstrations of individual postures into learning, by extracting the inherent problem structure to find an efficient state representation in the learnt latent space.
Policy search for motor primitives in robotics
A novel EM-inspired algorithm for policy learning that is particularly well-suited for dynamical system motor primitives is introduced and applied in the context of motor learning and can learn a complex Ball-in-a-Cup task on a real Barrett WAM™ robot arm.
Reinforcement learning of motor skills with policy gradients
A Survey on Policy Search for Robotics
This work classifies model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy, and presents a unified view of existing algorithms.
Learning concurrent motor skills in versatile solution spaces
This paper presents a complete framework capable of learning different solution strategies for a real-robot Tetherball task; it simultaneously learns multiple distinct solutions for the same task, so that a partial degeneration of this solution space does not prevent successful completion of the task.
Reinforcement learning in robotics: A survey
This article attempts to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots by highlighting both key challenges in robot reinforcement learning as well as notable successes.
Learning omnidirectional path following using dimensionality reduction
A method is proposed that uses a (possibly inaccurate) simulator to identify a low-dimensional subspace of policies that spans the variations in model dynamics that can be learned on the real system using much less data than would be required to learn a policy in the original class.
Dimensional reduction for reward-based learning
It is shown that Hebbian forms of synaptic plasticity applied to synapses between a supervisor circuit and the network it is controlling can effectively reduce the dimension of the space of parameters being searched to support efficient reinforcement-based learning in large networks.
Learning to select and generalize striking movements in robot table tennis
This paper presents a new framework that allows a robot to learn cooperative table tennis from physical interaction with a human and shows that the resulting setup is capable of playing table tennis using an anthropomorphic robot arm.
Reinforcement learning and dimensionality reduction: A model in computational neuroscience
A mechanism for the outstanding reduction of dimensionality from the input to the output of the basal ganglia is studied within a model more realistic from a computational neuroscience point of view, and its feasibility when the loop is closed is shown.