Corpus ID: 11663660

Data-Efficient Generalization of Robot Skills with Contextual Policy Search

@inproceedings{Kupcsik2013DataEfficientGO,
  title={Data-Efficient Generalization of Robot Skills with Contextual Policy Search},
  author={Andras G. Kupcsik and Marc Peter Deisenroth and Jan Peters and Gerhard Neumann},
  booktitle={AAAI},
  year={2013}
}
In robotics, controllers make the robot solve a task within a specific context. The context can describe the objectives of the robot or physical properties of the environment and is always specified before task execution. To generalize the controller to multiple contexts, we follow a hierarchical approach for policy learning: A lower-level policy controls the robot for a given context and an upper-level policy generalizes among contexts. Current approaches for learning such upper-level policies…
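The hierarchical decomposition described in the abstract can be sketched in a few lines — a minimal, hypothetical illustration, not the authors' implementation; all class, function, and variable names here are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Upper-level policy pi(theta | s): a linear-Gaussian mapping from a
# context vector s to the parameters theta of the lower-level controller.
class UpperLevelPolicy:
    def __init__(self, context_dim, param_dim):
        self.W = np.zeros((param_dim, context_dim))  # context-to-mean map
        self.b = np.zeros(param_dim)                 # mean offset
        self.cov = np.eye(param_dim)                 # exploration covariance

    def sample_params(self, context):
        mean = self.W @ context + self.b
        return rng.multivariate_normal(mean, self.cov)

# Lower-level policy: here a trivial linear feedback controller whose
# gains are the parameters chosen by the upper level before execution.
def lower_level_action(theta, state):
    return theta @ state

context = np.array([0.5, -1.0])          # e.g. a target position for the task
policy = UpperLevelPolicy(context_dim=2, param_dim=2)
theta = policy.sample_params(context)    # pick a controller for this context
action = lower_level_action(theta, np.array([1.0, 2.0]))
```

The key point the abstract makes is the separation of concerns: the lower level runs a fixed controller during the episode, while learning happens at the upper level, which only decides controller parameters once per context.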
Model-based contextual policy search for data-efficient generalization of robot skills
TLDR: A novel model-based contextual policy search algorithm is proposed that generalizes lower-level controllers and is data-efficient, based on learned probabilistic forward models and information-theoretic policy search.
Factored Contextual Policy Search with Bayesian optimization
TLDR: This paper applies factorization to a Bayesian optimization approach to contextual policy search, in both sampling-based and active-learning settings, and shows faster learning and better generalization in various robotic domains.
A Survey on Policy Search for Robotics
TLDR: This work classifies model-free methods by their policy evaluation, policy update, and exploration strategies, and presents a unified view of existing algorithms.
Accounting for Task-Difficulty in Active Multi-Task Robot Control Learning
TLDR: This work proposes the novel approach PUBSVE for estimating a reward baseline and investigates empirically, on benchmark problems and simulated robotic tasks, to what extent this method can remedy the issue of non-comparable rewards.
Learning Replanning Policies With Direct Policy Search
TLDR: This work proposes a framework to learn trajectory-replanning policies via contextual policy search and demonstrates that they are safe for the robot, can be learned efficiently, and outperform non-replanning policies on problems with partially observable or perturbed context.
Active contextual policy search
TLDR: It is argued that there is a better way than selecting each task equally often, because some tasks might be easier to learn at the beginning, and the knowledge the agent extracts from these tasks can be transferred to similar but more difficult tasks.
Hierarchical Relative Entropy Policy Search
TLDR: This work defines the problem of learning sub-policies in continuous state-action spaces as finding a hierarchical policy composed of a high-level gating policy that selects low-level sub-policies for execution by the agent, and treats the sub-policies as latent variables, which allows the update information to be distributed between them.
Model-Based Policy Gradients with Parameter-Based Exploration by Least-Squares Conditional Density Estimation
TLDR: A novel model-based RL method is proposed by combining a recently proposed model-free policy search method, policy gradients with parameter-based exploration, with the state-of-the-art transition model estimator, least-squares conditional density estimation.
Contextual Policy Search for Generalizing a Parameterized Biped Walking Controller
TLDR: The desired flexibility of the controller is achieved by applying the recently developed contextual relative entropy policy search (REPS) method, which can generalize the robot's walking controller across different contexts, where a context is described by a real-valued vector.
Gaussian Processes for Data-Efficient Learning in Robotics and Control
TLDR: This paper learns a probabilistic, non-parametric Gaussian process transition model of the system and applies it to autonomous learning in real robot and control tasks, achieving an unprecedented speed of learning.

References

Showing 1–10 of 22 references
Hierarchical Relative Entropy Policy Search
TLDR: This work defines the problem of learning sub-policies in continuous state-action spaces as finding a hierarchical policy composed of a high-level gating policy that selects low-level sub-policies for execution by the agent, and treats the sub-policies as latent variables, which allows the update information to be distributed between them.
Policy Search for Motor Primitives in Robotics
TLDR: This paper extends previous work on policy learning from the immediate-reward case to episodic reinforcement learning, resulting in a general, common framework also connected to policy gradient methods, and yielding a novel policy learning algorithm that is particularly well suited for dynamic motor primitives.
Autonomous helicopter control using reinforcement learning policy search methods
  • J. Bagnell, J. Schneider
  • Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164), 2001
TLDR: This work considers algorithms that evaluate and synthesize controllers under distributions of Markovian models, and demonstrates the presented learning control algorithm by flying an autonomous helicopter, showing that the learned controller is robust and delivers good performance in this real-world domain.
Using inaccurate models in reinforcement learning
TLDR: This paper presents a hybrid algorithm that requires only an approximate model and a small number of real-life trials, yet achieves near-optimal performance in the real system, even when the model is only approximate.
Reinforcement learning of motor skills in high dimensions: A path integral approach
TLDR: This paper derives a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals; the authors believe the new algorithm, Policy Improvement with Path Integrals (PI2), is currently one of the most efficient, numerically robust, and easy-to-implement algorithms for RL in robotics.
PILCO: A Model-Based and Data-Efficient Approach to Policy Search
TLDR: PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way by learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning.
Reinforcement Learning to Adjust Robot Movements to New Situations
TLDR: This paper describes how to learn such mappings from circumstances to meta-parameters using reinforcement learning, and uses a kernelized version of reward-weighted regression to do so.
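The reward-weighted regression idea in this entry can be illustrated with a minimal, non-kernelized sketch, assuming a linear-Gaussian mapping from contexts to meta-parameters; the function and variable names below are invented for illustration, not taken from the paper:

```python
import numpy as np

def reward_weighted_regression(contexts, params, rewards, beta=1.0):
    """Fit a linear map from contexts to parameters, weighting each
    sample by its exponentiated reward (higher reward = more influence)."""
    w = np.exp(beta * (rewards - rewards.max()))        # shift for stability
    X = np.hstack([contexts, np.ones((len(contexts), 1))])  # append bias term
    W = np.diag(w)
    # Weighted least squares: theta ≈ A^T @ [s; 1]
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ params)

rng = np.random.default_rng(1)
S = rng.normal(size=(50, 2))                             # sampled contexts
theta = S @ np.array([[1.0], [2.0]]) + 0.1 * rng.normal(size=(50, 1))
R = -np.sum((theta - 1.0) ** 2, axis=1)                  # toy reward signal
A = reward_weighted_regression(S, theta, R)              # shape (3, 1)
```

The exponentiated rewards act as soft sample weights, so the update reduces to a weighted maximum-likelihood fit; the kernelized variant in the paper replaces the linear feature map with kernel functions.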
Learning Attractor Landscapes for Learning Motor Primitives
TLDR: By nonlinearly transforming the canonical attractor dynamics using techniques from nonparametric regression, almost arbitrary new nonlinear policies can be generated without losing the stability properties of the canonical system.
Variational Inference for Policy Search in changing situations
TLDR: Variational Inference for Policy Search (VIP) has several interesting properties and matches the performance of state-of-the-art methods while being applicable to learning simultaneously in multiple situations.
Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning
TLDR: It is demonstrated how a low-cost, off-the-shelf robotic system can learn closed-loop policies for a stacking task in only a handful of trials, from scratch.