# Inverse Reinforcement Learning with PI²

```bibtex
@inproceedings{Kalakrishnan2010InverseRL,
  title  = {Inverse Reinforcement Learning with PI 2},
  author = {Mrinal Kalakrishnan and Evangelos A. Theodorou and Stefan Schaal},
  year   = {2010}
}
```

We present an algorithm that recovers an unknown cost function from expert-demonstrated trajectories in continuous space. We assume that the cost function is a weighted linear combination of features, and we are able to learn weights that result in a cost function under which the expert-demonstrated trajectories are optimal. Unlike previous approaches [1], [2], our algorithm does not require repeated solving of the forward problem (i.e., finding optimal trajectories under a candidate cost…
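The cost model assumed by the abstract, a weighted linear combination of trajectory features, can be sketched as follows. This is only an illustration of the general idea, not the paper's implementation; the function and feature names are hypothetical.

```python
import numpy as np

def trajectory_cost(features, weights):
    """Cost of a trajectory as a weighted linear combination of features.

    features: (T, K) array -- K feature values at each of T timesteps
    weights:  (K,) array   -- feature weights to be learned by IRL
    """
    # Accumulate each feature over the trajectory, then take the
    # weighted combination (hypothetical form, for illustration only).
    return float(np.dot(features.sum(axis=0), weights))

# Toy example: 3 timesteps, 2 features (e.g. effort and obstacle proximity).
phi = np.array([[0.1, 0.5],
                [0.2, 0.3],
                [0.1, 0.2]])
w = np.array([1.0, 2.0])
cost = trajectory_cost(phi, w)  # -> 2.4
```

IRL in this setting searches for weights `w` under which the demonstrated trajectories incur lower cost than alternative trajectories.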


#### 9 Citations

Maximum entropy inverse reinforcement learning in continuous state spaces with path integrals

- Mathematics, Computer Science
- 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems
- 2011

In this paper, we consider the problem of inverse reinforcement learning for a particular class of continuous-time stochastic systems with continuous state and action spaces, under the assumption…

Learning objective functions for manipulation

- Mathematics, Computer Science
- 2013 IEEE International Conference on Robotics and Automation
- 2013

An approach to learning objective functions for robotic manipulation is presented, based on inverse reinforcement learning, that can deal with high-dimensional continuous state-action spaces and requires only local optimality of the demonstrated trajectories.

A review of inverse reinforcement learning theory and recent advances

- Computer Science
- IEEE Congress on Evolutionary Computation
- 2012

Inverse Reinforcement Learning (IRL), an extension of RL, introduces a new way of learning policies by deriving the expert's intentions, in contrast to directly learning policies, which can be redundant and generalize poorly.

A survey of inverse reinforcement learning techniques

- Computer Science
- Int. J. Intell. Comput. Cybern.
- 2012

The original IRL algorithms and their close variants, as well as their recent advances, are reviewed and compared.

Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise

- Computer Science
- AAAI
- 2014

This paper develops a robust IRL framework that can accurately estimate the reward function in the presence of behavior noise, and introduces a novel latent variable characterizing the reliability of each expert action and uses Laplace distribution as its prior.

CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem

- Computer Science, Mathematics
- ArXiv
- 2019

Experimental results on standard benchmarks such as objectworld and pendulum show that the proposed algorithm can effectively learn the latent reward function in complex, high-dimensional environments.

Rewards for Imitation Learning

- 2017

Reward function design and exploration time are arguably the biggest obstacles to the deployment of reinforcement learning (RL) agents in the real world. In many real-world tasks, designing a…

Learning How Pedestrians Navigate: A Deep Inverse Reinforcement Learning Approach

- Computer Science
- 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2018

The evaluation results show that the proposed method has acceptable prediction accuracy compared to other state-of-the-art methods, and it can generate pedestrian trajectories similar to real human trajectories with natural social navigation behaviors such as collision avoidance, leader-follower, and split-and-rejoin.

Reinforcement Learning: Recent Threads

- Computer Science
- 2020

Detailed comparisons and discussion of six reinforcement learning algorithms, their exploration and exploitation strategies, and their weaknesses and strengths are presented in the paper.

#### References


Reinforcement learning of motor skills in high dimensions: A path integral approach

- Computer Science
- 2010 IEEE International Conference on Robotics and Automation
- 2010

This paper derives a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals, and argues that this new algorithm, Policy Improvement with Path Integrals (PI²), currently offers one of the most efficient, numerically robust, and easiest-to-implement algorithms for RL in robotics.
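At the core of path-integral policy improvement is a probability weighting of sampled rollouts by their exponentiated negative cost, so that cheaper rollouts dominate the parameter update. A minimal sketch of that weighting step (function name and temperature parameter `lam` are illustrative assumptions, not the paper's notation):

```python
import numpy as np

def pi2_softmax_weights(costs, lam=1.0):
    """Exponentiated-cost weighting of rollouts, as used in
    path-integral policy improvement:

        P_i = exp(-S_i / lam) / sum_j exp(-S_j / lam)

    Lower-cost rollouts receive exponentially larger weight.
    """
    costs = np.asarray(costs, dtype=float)
    shifted = costs - costs.min()      # shift for numerical stability
    w = np.exp(-shifted / lam)
    return w / w.sum()

# Three rollouts with increasing cost: weights sum to 1, and the
# cheapest rollout receives the largest weight.
P = pi2_softmax_weights([1.0, 2.0, 3.0])
```

The policy parameters are then updated toward a `P`-weighted average of the sampled parameter perturbations.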

Apprenticeship learning via inverse reinforcement learning

- Computer Science
- ICML
- 2004

This work models the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and gives an algorithm for learning the task demonstrated by the expert, based on using "inverse reinforcement learning" to recover the unknown reward function.
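With a linear reward, a policy's value is determined by its discounted feature expectations, which is what the apprenticeship-learning approach matches against the expert's demonstrations. A hedged sketch of that empirical quantity (names and layout are assumptions for illustration):

```python
import numpy as np

def feature_expectations(trajectories, gamma=0.9):
    """Empirical discounted feature expectations,
    mu = E[ sum_t gamma^t * phi(s_t) ],
    averaged over a set of demonstrated trajectories.

    trajectories: list of (T, K) feature arrays, one per demonstration.
    """
    mus = []
    for phi in trajectories:
        discounts = gamma ** np.arange(phi.shape[0])
        # Discount each timestep's features, then sum over time.
        mus.append((discounts[:, None] * phi).sum(axis=0))
    return np.mean(mus, axis=0)
```

Under a linear reward R(s) = w·phi(s), matching the learner's feature expectations to the expert's guarantees matching expected reward for any weight vector w.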

Learning to search: Functional gradient techniques for imitation learning

- Computer Science
- Auton. Robots
- 2009

The work presented extends the Maximum Margin Planning (MMP) framework to admit learning of more powerful, non-linear cost functions, and demonstrates practical real-world performance in three applied case studies: legged locomotion, grasp planning, and autonomous navigation in unstructured outdoor environments.

Learning Attractor Landscapes for Learning Motor Primitives

- Computer Science, Mathematics
- NIPS
- 2002

By nonlinearly transforming the canonical attractor dynamics using techniques from nonparametric regression, almost arbitrary new nonlinear policies can be generated without losing the stability properties of the canonical system.

Reinforcement learning of motor skills with policy gradients

- Computer Science, Medicine
- Neural Networks
- 2008

This paper examines learning of complex motor skills with human-like limbs, and combines the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning with the theory of stochastic policy gradient learning. Expand