Corpus ID: 18476493

Inverse Reinforcement Learning with PI2

@inproceedings{Kalakrishnan2010InverseRL,
  title={Inverse Reinforcement Learning with PI2},
  author={Mrinal Kalakrishnan and Evangelos A. Theodorou and Stefan Schaal},
  year={2010}
}
We present an algorithm that recovers an unknown cost function from expert-demonstrated trajectories in continuous space. We assume that the cost function is a weighted linear combination of features, and we are able to learn weights that result in a cost function under which the expert-demonstrated trajectories are optimal. Unlike previous approaches [1], [2], our algorithm does not require repeated solving of the forward problem (i.e., finding optimal trajectories under a candidate cost… 
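
The cost model described in the abstract can be illustrated with a short sketch: a trajectory cost defined as a weighted linear combination of features, c(tau) = w·phi(tau). The feature choices below (total squared velocity and total squared acceleration) are hypothetical placeholders, not the features used in the paper.

```python
import numpy as np

def trajectory_features(trajectory):
    """Hypothetical feature vector for a trajectory of shape (T, d):
    total squared velocity and total squared acceleration."""
    vel = np.diff(trajectory, axis=0)
    acc = np.diff(vel, axis=0)
    return np.array([np.sum(vel ** 2), np.sum(acc ** 2)])

def trajectory_cost(trajectory, weights):
    """Cost as a weighted linear combination of features: c(tau) = w . phi(tau)."""
    return weights @ trajectory_features(trajectory)

# Example: score a random 2-D trajectory under candidate weights.
rng = np.random.default_rng(0)
tau = np.cumsum(rng.normal(size=(50, 2)), axis=0)
w = np.array([1.0, 0.1])
print(trajectory_cost(tau, w))
```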

Citations

Maximum entropy inverse reinforcement learning in continuous state spaces with path integrals

  • N. Aghasadeghi, T. Bretl
  • Mathematics, Computer Science
    2011 IEEE/RSJ International Conference on Intelligent Robots and Systems
  • 2011
In this paper, we consider the problem of inverse reinforcement learning for a particular class of continuous-time stochastic systems with continuous state and action spaces, under the assumption… 

A survey of inverse reinforcement learning techniques

The original IRL algorithm and its close variants, as well as their recent advances, are reviewed and compared.

Unsupervised Perceptual Rewards for Imitation Learning

This work presents a method that identifies key intermediate steps of a task from only a handful of demonstration sequences and automatically selects the most discriminative features for recognizing these steps.

Reinforcement Learning: Recent Threads

Detailed comparisons and discussion of six reinforcement learning algorithms, their exploration and exploitation strategies, and their weaknesses and strengths are presented in the paper.

Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise

This paper develops a robust IRL framework that can accurately estimate the reward function in the presence of behavior noise; it introduces a novel latent variable characterizing the reliability of each expert action and uses a Laplace distribution as its prior.

A review of inverse reinforcement learning theory and recent advances

Inverse Reinforcement Learning (IRL), an extension of RL, introduces a new way of learning policies by inferring the expert's intentions, in contrast to learning policies directly, which can be redundant and generalize poorly.

References

Reinforcement learning of motor skills in high dimensions: A path integral approach

This paper derives a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals, and argues that the resulting algorithm, Policy Improvement with Path Integrals (PI2), is currently one of the most efficient, numerically robust, and easiest-to-implement algorithms for RL in robotics.
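
As a rough illustration of the PI2 update this reference introduces, the sketch below applies the exponentiated-cost weighting of exploration noise to form a parameter update. The rollout costs, noise samples, and temperature heuristic are synthetic assumptions, and the update is simplified relative to the full time-indexed algorithm.

```python
import numpy as np

def pi2_update(theta, noise, path_costs, temperature=None):
    """One simplified PI2-style parameter update.

    theta:      current policy parameters, shape (P,)
    noise:      exploration noise per rollout, shape (K, P)
    path_costs: cost per rollout, shape (K,)
    """
    S = np.asarray(path_costs, dtype=float)
    if temperature is None:
        # Heuristic assumption: scale costs into a fixed exponent range.
        temperature = (S.max() - S.min()) / 10.0 + 1e-8
    # Probability weights: low-cost rollouts dominate the update.
    w = np.exp(-(S - S.min()) / temperature)
    w /= w.sum()
    return theta + w @ noise

# Synthetic example: 20 rollouts of a 5-parameter policy.
rng = np.random.default_rng(0)
theta = np.zeros(5)
eps = rng.normal(size=(20, 5))
costs = rng.uniform(1.0, 10.0, size=20)
print(pi2_update(theta, eps, costs))
```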

Apprenticeship learning via inverse reinforcement learning

This work thinks of the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and gives an algorithm for learning the task demonstrated by the expert, based on using "inverse reinforcement learning" to try to recover the unknown reward function.
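
The feature-matching idea summarized here can be sketched as follows: estimate discounted feature expectations from expert demonstrations and from the current policy's rollouts, then move the reward weights along the gap between them (one step of a projection-style update). The feature function, trajectories, and discount factor below are hypothetical, and the RL inner loop that re-optimizes the policy under the candidate reward is omitted.

```python
import numpy as np

def feature_expectations(trajectories, feature_fn, gamma=0.99):
    """Empirical discounted feature expectations mu = E[sum_t gamma^t phi(s_t)]."""
    mus = []
    for traj in trajectories:
        phi = np.array([feature_fn(s) for s in traj])
        discounts = gamma ** np.arange(len(traj))
        mus.append(discounts @ phi)
    return np.mean(mus, axis=0)

# Hypothetical 1-D state feature: [state, state^2].
feature_fn = lambda s: np.array([s, s ** 2])

rng = np.random.default_rng(1)
expert_trajs = [rng.normal(1.0, 0.1, size=30) for _ in range(5)]
policy_trajs = [rng.normal(0.0, 0.5, size=30) for _ in range(5)]

mu_expert = feature_expectations(expert_trajs, feature_fn)
mu_policy = feature_expectations(policy_trajs, feature_fn)

# Reward weights point from the current policy's feature
# expectations toward the expert's.
w = mu_expert - mu_policy
print(w)
```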

Learning to search: Functional gradient techniques for imitation learning

The work presented extends the Maximum Margin Planning (MMP) framework to admit learning of more powerful, non-linear cost functions, and demonstrates practical real-world performance in three applied case studies: legged locomotion, grasp planning, and autonomous navigation in unstructured outdoor environments.

Learning Attractor Landscapes for Learning Motor Primitives

By nonlinearly transforming the canonical attractor dynamics using techniques from nonparametric regression, almost arbitrary new nonlinear policies can be generated without losing the stability properties of the canonical system.
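
The idea of shaping stable point-attractor dynamics with a learned nonlinear forcing term can be shown in a minimal one-dimensional sketch; the gains, basis functions, and weights below are illustrative assumptions rather than values from the paper.

```python
import numpy as np

def primitive_rollout(y0, g, weights, centers, widths, tau=1.0, alpha_z=25.0,
                      beta_z=6.25, alpha_x=8.0, dt=0.001, steps=1000):
    """Integrate a 1-D movement-primitive-style system: a stable point
    attractor toward goal g, shaped by a learned forcing term f(x)."""
    y, z, x = y0, 0.0, 1.0
    traj = []
    for _ in range(steps):
        psi = np.exp(-widths * (x - centers) ** 2)                 # basis activations
        f = (psi @ weights) / (psi.sum() + 1e-10) * x * (g - y0)   # forcing term
        zdot = (alpha_z * (beta_z * (g - y) - z) + f) / tau
        ydot = z / tau
        xdot = -alpha_x * x / tau                                  # canonical system
        z += zdot * dt
        y += ydot * dt
        x += xdot * dt
        traj.append(y)
    return np.array(traj)

# Example: a primitive from 0 to 1 with random forcing-term weights.
rng = np.random.default_rng(2)
centers = np.linspace(0.0, 1.0, 10)
print(primitive_rollout(0.0, 1.0, rng.normal(size=10), centers, np.full(10, 50.0))[-1])
```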

Reinforcement learning of motor skills with policy gradients