# Inverse Reinforcement Learning with PI²

```bibtex
@inproceedings{Kalakrishnan2010InverseRL,
  title  = {Inverse Reinforcement Learning with PI2},
  author = {Mrinal Kalakrishnan and Evangelos A. Theodorou and Stefan Schaal},
  year   = {2010}
}
```

We present an algorithm that recovers an unknown cost function from expert-demonstrated trajectories in continuous space. We assume that the cost function is a weighted linear combination of features, and we learn weights that result in a cost function under which the expert-demonstrated trajectories are optimal. Unlike previous approaches [1], [2], our algorithm does not require repeated solving of the forward problem (i.e., finding optimal trajectories under a candidate cost…
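The cost model in the abstract is a weighted linear combination of features. The sketch below illustrates that model only; the feature names, array shapes, and function are assumptions for illustration, not the authors' code.

```python
import numpy as np

def trajectory_cost(weights, features):
    """Cost of one trajectory under the linear model J(tau) = w^T phi(tau).

    weights:  (K,) array of feature weights (what IRL tries to recover)
    features: (T, K) array of per-timestep feature values along the trajectory
    """
    per_step = features @ weights   # (T,) cost incurred at each timestep
    return per_step.sum()           # total trajectory cost

# Example with 3 hypothetical features (e.g. squared acceleration,
# obstacle proximity, control effort) over a 100-step trajectory.
rng = np.random.default_rng(0)
phi = rng.random((100, 3))
w = np.array([0.5, 2.0, 0.1])
print(trajectory_cost(w, phi))
```

Under this model, learning the cost function reduces to finding weights `w` that make the demonstrated trajectories lower-cost than sampled alternatives.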


## 9 Citations

Maximum entropy inverse reinforcement learning in continuous state spaces with path integrals

- IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2011

In this paper, we consider the problem of inverse reinforcement learning for a particular class of continuous-time stochastic systems with continuous state and action spaces, under the assumption…

Learning objective functions for manipulation

- IEEE International Conference on Robotics and Automation (ICRA), 2013

An approach to learning objective functions for robotic manipulation based on inverse reinforcement learning that can deal with high-dimensional continuous state-action spaces, and only requires local optimality of demonstrated trajectories is presented.

A review of inverse reinforcement learning theory and recent advances

- IEEE Congress on Evolutionary Computation, 2012

Inverse Reinforcement Learning (IRL), an extension of RL, introduces a new way of learning policies by inferring the expert's intentions, in contrast to directly learning policies, which can be redundant and generalize poorly.

A survey of inverse reinforcement learning techniques

- International Journal of Intelligent Computing and Cybernetics, 2012

The original IRL algorithms and their close variants, as well as recent advances, are reviewed and compared.

Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise

- AAAI, 2014

This paper develops a robust IRL framework that can accurately estimate the reward function in the presence of behavior noise, and introduces a novel latent variable characterizing the reliability of each expert action and uses Laplace distribution as its prior.

CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem

- arXiv, 2019

Experimental results on standard benchmarks such as objectworld and pendulum show that the proposed algorithm can effectively learn the latent reward function in complex, high-dimensional environments.

Rewards for Imitation Learning

- 2017

Reward function design and exploration time are arguably the biggest obstacles to the deployment of reinforcement learning (RL) agents in the real world. In many real-world tasks, designing a…

Learning How Pedestrians Navigate: A Deep Inverse Reinforcement Learning Approach

- IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018

The evaluation results show that the proposed method has acceptable prediction accuracy compared to other state-of-the-art methods, and it can generate pedestrian trajectories similar to real human trajectories with natural social navigation behaviors such as collision avoidance, leader-follower, and split-and-rejoin.

Reinforcement Learning: Recent Threads

- 2020

Detailed comparisons and discussion of six reinforcement learning algorithms, their exploration and exploitation strategies, and their weaknesses and strengths are presented in the paper.

## References


Reinforcement learning of motor skills in high dimensions: A path integral approach

- IEEE International Conference on Robotics and Automation (ICRA), 2010

This paper derives a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals; the resulting algorithm, Policy Improvement with Path Integrals (PI2), is argued to be among the most efficient, numerically robust, and easy-to-implement RL algorithms for robotics.
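The core of PI2 as described in the path-integral RL literature is a probability-weighted parameter update: each noisy rollout is weighted by the softmax of its negative cost. The sketch below captures that step only, under assumed shapes and a hypothetical temperature `lam`; it is not the authors' implementation.

```python
import numpy as np

def pi2_update(theta, eps, costs, lam=1.0):
    """One PI2-style parameter update from K noisy rollouts.

    theta: (D,) current policy parameters
    eps:   (K, D) exploration noise added to theta in each rollout
    costs: (K,) cost-to-go S_k observed for each rollout
    lam:   temperature of the softmax over rollout costs
    """
    s = costs - costs.min()   # shift costs for numerical stability
    p = np.exp(-s / lam)
    p /= p.sum()              # probability of each rollout: low cost -> high weight
    return theta + p @ eps    # update = probability-weighted average of the noise
```

With equal rollout costs the update reduces to the plain average of the exploration noise; as `lam` shrinks, it concentrates on the single best rollout.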

Apprenticeship learning via inverse reinforcement learning

- ICML, 2004

This work thinks of the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and gives an algorithm for learning the task demonstrated by the expert, based on using "inverse reinforcement learning" to try to recover the unknown reward function.

Learning to search: Functional gradient techniques for imitation learning

- Autonomous Robots, 2009

The work presented extends the Maximum Margin Planning (MMP) framework to admit learning of more powerful, non-linear cost functions, and demonstrates practical real-world performance with three applied case-studies including legged locomotion, grasp planning, and autonomous outdoor unstructured navigation.

Learning Attractor Landscapes for Learning Motor Primitives

- NIPS, 2002

By nonlinearly transforming the canonical attractor dynamics using techniques from nonparametric regression, almost arbitrary new nonlinear policies can be generated without losing the stability properties of the canonical system.

Reinforcement learning of motor skills with policy gradients

- Neural Networks, 2008

This paper examines learning of complex motor skills with human-like limbs, and combines the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning with the theory of stochastic policy gradient learning.