There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies. PI² is a recent example of this approach. It combines a derivation from first principles of stochastic optimal control with tools from statistical estimation theory. In this paper, we consider PI² as a member of the …
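The core of PI²-style policy improvement is probability-weighted averaging: sample perturbations of the policy parameters, roll them out, and average the perturbations weighted by an exponentiated cost. The sketch below shows that update in simplified form; the cost function, temperature `h`, and rollout count are illustrative assumptions, not details from the abstract.

```python
import numpy as np

def pi2_update(theta, sigma, cost_fn, n_rollouts=20, h=10.0, rng=None):
    """One simplified PI^2-style policy-improvement step.

    Samples Gaussian perturbations of the parameter vector, evaluates
    their cost, and returns theta shifted by the probability-weighted
    average of the perturbations (low cost -> high weight).
    """
    rng = rng or np.random.default_rng(0)
    eps = rng.normal(0.0, sigma, size=(n_rollouts, theta.size))
    costs = np.array([cost_fn(theta + e) for e in eps])
    # Normalize costs to [0, 1], then exponentiate: softmax over -cost
    c = (costs - costs.min()) / max(costs.max() - costs.min(), 1e-12)
    w = np.exp(-h * c)
    w /= w.sum()
    return theta + w @ eps

# Usage: drive a toy quadratic cost toward its minimum at the origin
rng = np.random.default_rng(0)
theta = np.array([2.0, -1.5])
for _ in range(50):
    theta = pi2_update(theta, sigma=0.3,
                       cost_fn=lambda th: float(np.sum(th ** 2)), rng=rng)
```

Because the update uses only sampled costs, no gradients of the cost function are ever required, which is what makes this family of methods applicable to discontinuous costs.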
One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday …
In recent years, research on movement primitives has gained increasing popularity. The original goals of movement primitives are based on the desire to have a sufficiently rich and abstract representation for movement generation, which allows for efficient teaching, trial-and-error learning, and generalization of motor skills (Schaal 1999). Thus, motor …
This article describes the computational model underlying the AGILO autonomous robot soccer team, its implementation, and our experiences with it. According to our model the control system of an autonomous soccer robot consists of a probabilistic game state estimator and a situated action selection module. The game state estimator computes the robot's …
Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions. Second, in manipulation, the end-point of the movement must be chosen carefully, as it represents a grasp which must be adapted to the pose and shape of the object. …
Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. There are two main approaches to performing this optimization: reinforcement learning (RL) and black-box optimization (BBO). Whereas BBO algorithms are generic optimization methods that, due to their generality, may also be applied to optimizing policy …
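To make the BBO side of this comparison concrete, the sketch below applies the cross-entropy method, a generic black-box optimizer, to policy parameters: it uses only scalar utility values per parameter vector, with no gradients and no per-time-step reward structure. The toy utility and hyperparameters are illustrative assumptions.

```python
import numpy as np

def cem_optimize(utility, dim, n_samples=50, n_elite=10, iters=30, seed=0):
    """Black-box policy improvement via the cross-entropy method.

    Repeatedly: sample parameter vectors from a Gaussian, evaluate their
    utility, keep the elite fraction, and refit the Gaussian to the elites.
    Only utility(theta) is queried, which is what makes the method generic.
    """
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        samples = rng.normal(mu, sigma, size=(n_samples, dim))
        scores = np.array([utility(s) for s in samples])
        elite = samples[np.argsort(scores)[-n_elite:]]  # highest utility
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mu

# Usage: maximize a toy utility whose optimum is at [1, -2]
best = cem_optimize(lambda th: -np.sum((th - np.array([1.0, -2.0])) ** 2),
                    dim=2)
```

An RL-style policy-improvement method, by contrast, would exploit the temporal structure of the rollouts (states, actions, and rewards at each step) rather than treating the whole episode as a single black-box evaluation.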
One of the long-term challenges of programming by demonstration is achieving generality, i.e. automatically adapting the reproduced behavior to novel situations. A common approach for achieving generality is to learn parameterizable skills from multiple demonstrations for different situations. In this paper, we generalize recent approaches on learning …