Sparse Latent Space Policy Search

@inproceedings{Luck2016SparseLS,
  title={Sparse Latent Space Policy Search},
  author={Kevin Sebastian Luck and Joni Pajarinen and Erik Berger and Ville Kyrki and Heni Ben Amor},
  booktitle={AAAI},
  year={2016}
}
Computational agents often need to learn policies that involve many control variables, e.g., a robot needs to control several joints simultaneously. Learning a policy with a high number of parameters, however, usually requires a large number of training samples. We introduce a reinforcement learning method for sample-efficient policy search that exploits correlations between control variables. Such correlations are particularly frequent in motor skill learning tasks. The introduced method… 
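To make the core idea concrete, below is a minimal, hypothetical sketch of episodic policy search that explores in a low-dimensional latent space and projects samples into the full joint space. This is not the authors' implementation: all dimensions, the toy reward function, and the fixed projection matrix W are illustrative assumptions, a generic reward-weighted (EM/PoWER-style) update stands in for the paper's actual inference procedure, and W is fixed here rather than learned with a sparsity prior.

import numpy as np

rng = np.random.default_rng(0)

n_joints, latent_dim, horizon = 20, 3, 50    # many control variables, few latent ones
W = rng.normal(size=(n_joints, latent_dim))  # fixed projection encoding joint correlations
target = W @ rng.normal(size=(latent_dim, horizon))  # toy reference joint trajectory

def rollout_reward(params):
    # Toy episodic reward: negative squared tracking error of a joint trajectory.
    return -np.sum((params - target) ** 2)

mean_z = np.zeros((latent_dim, horizon))     # policy mean, parameterized in latent space
sigma = 1.0

for it in range(200):
    # Sample exploration noise only in the latent space (latent_dim << n_joints),
    # then project each sample to the full joint space with W.
    samples = [mean_z + sigma * rng.normal(size=mean_z.shape) for _ in range(10)]
    rewards = np.array([rollout_reward(W @ z) for z in samples])
    # Reward-weighted update of the latent mean (a generic EM/PoWER-style step).
    w = np.exp((rewards - rewards.max()) / (rewards.std() + 1e-8))
    w /= w.sum()
    mean_z = sum(wi * zi for wi, zi in zip(w, samples))
    sigma *= 0.99                             # gradually reduce exploration

print("final reward:", rollout_reward(W @ mean_z))

Because the exploration noise lives in the latent space, the number of parameters being searched scales with the latent dimensionality rather than with the number of joints, which is the source of the sample efficiency the abstract describes.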

Information Maximizing Exploration with a Latent Dynamics Model
TLDR
This work presents an approach that uses a model to derive reward bonuses as a means of intrinsic motivation to improve model-free reinforcement learning and is both theoretically grounded and computationally advantageous, permitting the efficient use of Bayesian information-theoretic methods in high-dimensional state spaces.
Multimodal Policy Search using Overlapping Mixtures of Sparse Gaussian Process Prior
TLDR
A novel policy search reinforcement learning algorithm that can deal with multimodality in control policies by placing overlapping mixtures of sparse Gaussian processes (OMSGPs) as the prior of the multimodal control policy.
Latent Space Reinforcement Learning for Steering Angle Prediction
TLDR
This work addresses the problem of learning driving policies for an autonomous agent in a high-fidelity simulator with a modular deep reinforcement learning approach to predict the steering angle of the car from raw images.
Motor Synergy Development in High-Performing Deep Reinforcement Learning Algorithms
TLDR
This is the first attempt to quantify synergy development in detail and evaluate its emergence during deep-reinforcement-learning motor control tasks; a correlation is demonstrated between the synergy-related metrics and the performance and energy efficiency of a trained agent.
Extracting bimanual synergies with reinforcement learning
  • K. Luck, H. B. Amor
  • Computer Science
    2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2017
TLDR
It is discussed how synergies can be learned through latent space policy search, and an extension of the algorithm for the re-use of previously learned synergies for exploration is introduced.
Sample-Efficient Reinforcement Learning for Robot to Human Handover Tasks
TLDR
This work used the Sparse Latent Space Policy Search algorithm and a linear-Gaussian trajectory approximator with the objective of learning optimized, understandable trajectories for object handovers between a robot and a human with very high sample efficiency.
From the Lab to the Desert: Fast Prototyping and Learning of Robot Locomotion
TLDR
The findings of this study show that static policies developed in the laboratory do not translate to effective locomotion strategies in natural environments, and sample-efficient reinforcement learning can help to rapidly accommodate changes in the environment or the robot.
Bi-manual Learning for a Basketball Playing Robot
TLDR
A dual-armed robot has been built and taught to handle the ball and make the basket successfully, demonstrating the capability of using both arms.
Reinforced Wasserstein Training for Severity-Aware Semantic Segmentation in Autonomous Driving
TLDR
A Wasserstein training framework is developed to explore the inter-class correlation by defining its ground metric as misclassification severity, and an adaptive learning scheme for the ground matrix is proposed, utilizing the high-fidelity CARLA simulator.
...

References

Latent space policy search for robotics
TLDR
This paper presents a novel policy search method that performs efficient reinforcement learning by uncovering the low-dimensional latent space of actuator redundancies; rather than performing dimensionality reduction as a preprocessing step, it naturally combines it with policy search.
Using dimensionality reduction to exploit constraints in reinforcement learning
TLDR
This work presents an alternative way to incorporate prior knowledge from demonstrations of individual postures into learning, by extracting the inherent problem structure to find an efficient state representation in the learnt latent space.
Policy search for motor primitives in robotics
TLDR
A novel EM-inspired algorithm for policy learning that is particularly well-suited for dynamical system motor primitives is introduced and applied in the context of motor learning; it can learn a complex Ball-in-a-Cup task on a real Barrett WAM™ robot arm.
Variational Inference for Policy Search in changing situations
TLDR
Variational Inference for Policy Search (VIP) has several interesting properties and matches the performance of state-of-the-art methods while being applicable to learning simultaneously in multiple situations.
Learning omnidirectional path following using dimensionality reduction
TLDR
A method is proposed that uses a (possibly inaccurate) simulator to identify a low-dimensional subspace of policies that spans the variations in model dynamics that can be learned on the real system using much less data than would be required to learn a policy in the original class.
Natural Actor-Critic
Towards Motor Skill Learning for Robotics
TLDR
This paper proposes to break the generic skill learning problem into parts that are well understood from a robotics point of view and designs appropriate learning approaches for these basic components, which serve as the ingredients of a general approach to motor skill learning.
Probabilistic inference for solving discrete and continuous state Markov Decision Processes
TLDR
An Expectation-Maximization algorithm is presented for computing optimal policies that optimizes the discounted expected future return for arbitrary reward functions, without assuming an ad hoc finite total time.
Robot trajectory optimization using approximate inference
TLDR
This work considers a probabilistic model for which the maximum likelihood (ML) trajectory coincides with the optimal trajectory; the model reproduces the classical stochastic optimal control (SOC) solution and utilizes approximate inference methods that generalize efficiently to non-LQG systems.
Black-Box Policy Search with Probabilistic Programs
TLDR
This work relates classic policy gradient techniques to recently introduced black-box variational methods which generalize to probabilistic program inference and presents case studies in the Canadian traveler problem, Rock Sample, and a benchmark for optimal diagnosis inspired by Guess Who.
...