Corpus ID: 18476493

Inverse Reinforcement Learning with PI²

  • Mrinal Kalakrishnan, Evangelos A. Theodorou, Stefan Schaal
We present an algorithm that recovers an unknown cost function from expert-demonstrated trajectories in continuous space. We assume that the cost function is a weighted linear combination of features, and we are able to learn weights that result in a cost function under which the expert demonstrated trajectories are optimal. Unlike previous approaches [1], [2], our algorithm does not require repeated solving of the forward problem (i.e., finding optimal trajectories under a candidate cost… 
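The weighted-linear-cost assumption stated above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the feature matrix, weight values, and function name are made up:

```python
import numpy as np

def trajectory_cost(features, weights):
    """Cost of a trajectory under a weighted linear combination of features:
    J(tau) = sum_t w . phi(x_t), with phi(x_t) the feature vector at step t."""
    # features: (T, K) array of K feature values at each of T timesteps
    # weights:  (K,) weight vector to be learned from demonstrations
    return float(np.sum(features @ weights))

# Illustrative example: T = 3 timesteps, K = 2 features
phi = np.array([[1.0, 0.5],
                [2.0, 0.0],
                [0.0, 1.5]])
w = np.array([0.3, 0.7])
print(trajectory_cost(phi, w))  # 0.3*(1+2+0) + 0.7*(0.5+0+1.5) = 2.3
```

Inverse RL then searches for weights under which the demonstrated trajectories score no worse than alternatives.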


Maximum entropy inverse reinforcement learning in continuous state spaces with path integrals
  • N. Aghasadeghi, T. Bretl
  • Mathematics, Computer Science
    2011 IEEE/RSJ International Conference on Intelligent Robots and Systems
  • 2011
In this paper, we consider the problem of inverse reinforcement learning for a particular class of continuous-time stochastic systems with continuous state and action spaces, under the assumption…
Learning objective functions for manipulation
An approach to learning objective functions for robotic manipulation based on inverse reinforcement learning is presented; it handles high-dimensional continuous state-action spaces and requires only local optimality of the demonstrated trajectories.
A review of inverse reinforcement learning theory and recent advances
  • Zhifei Shao, M. Er
  • Computer Science
    IEEE Congress on Evolutionary Computation
  • 2012
Inverse Reinforcement Learning (IRL), an extension of RL, introduces a new way of learning policies by inferring the expert's intentions, in contrast to directly learning policies, which can be redundant and generalize poorly.
A survey of inverse reinforcement learning techniques
The original IRL algorithms and their close variants, as well as their recent advances, are reviewed and compared.
Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise
This paper develops a robust IRL framework that can accurately estimate the reward function in the presence of behavior noise; it introduces a novel latent variable characterizing the reliability of each expert action and uses a Laplace distribution as its prior.
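The role of a Laplace prior in modeling sparse behavior noise can be illustrated with a small sketch; the scale parameter and the Gaussian comparison below are illustrative, not taken from the paper:

```python
import numpy as np

def laplace_pdf(x, loc=0.0, scale=0.5):
    """Laplace density: a sharp peak at loc with heavy tails, encoding
    'mostly reliable, occasionally far off' behavior noise."""
    return np.exp(-np.abs(x - loc) / scale) / (2.0 * scale)

# Compared with a Gaussian of equal variance (2 * scale^2), the Laplace
# puts more mass both near zero and far out in the tails.
sigma = np.sqrt(2.0) * 0.5
def gauss_pdf(x):
    return np.exp(-x**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))

print(laplace_pdf(0.0) > gauss_pdf(0.0))  # True: sharper peak at zero
```

That shape is what makes the prior favor explanations where most expert actions carry no noise and only a few are outliers.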
CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem
Experimental results on standard benchmarks such as objectworld and pendulum show that the proposed algorithm can effectively learn the latent reward function in complex, high-dimensional environments.
Reward function design and exploration time are arguably the biggest obstacles to the deployment of reinforcement learning (RL) agents in the real world. In many real-world tasks, designing a
Learning How Pedestrians Navigate: A Deep Inverse Reinforcement Learning Approach
  • M. Fahad, Zhuo Chen, Yi Guo
  • Computer Science
    2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2018
The evaluation results show that the proposed method has acceptable prediction accuracy compared to other state-of-the-art methods, and it can generate pedestrian trajectories similar to real human trajectories with natural social navigation behaviors such as collision avoidance, leader-follower, and split-and-rejoin.
Reinforcement Learning: Recent Threads
Detailed comparisons and discussion of six reinforcement learning algorithms, their exploration and exploitation strategies, and their weaknesses and strengths are presented in the paper.


Reinforcement learning of motor skills in high dimensions: A path integral approach
This paper derives a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals, and argues that the resulting algorithm, Policy Improvement with Path Integrals (PI²), is currently among the most efficient, numerically robust, and easy-to-implement algorithms for RL in robotics.
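A PI²-style update can be sketched as softmax-weighted averaging of exploration noise over sampled rollouts. This is a simplified illustration of the general scheme, not the paper's algorithm; the hyperparameters and toy cost are made up:

```python
import numpy as np

def pi2_update(theta, cost_fn, rng, n_rollouts=20, sigma=0.2, lam=0.1):
    """One PI2-style update: sample noisy parameter rollouts, score them,
    and average the exploration noise with softmax weights over cost."""
    eps = rng.normal(0.0, sigma, size=(n_rollouts, theta.size))  # exploration noise
    costs = np.array([cost_fn(theta + e) for e in eps])
    w = np.exp(-(costs - costs.min()) / lam)   # low cost -> high weight
    w /= w.sum()
    return theta + w @ eps                     # probability-weighted noise average

# Toy usage: drive theta toward a target by minimizing squared distance
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0])
theta = np.zeros(2)
for _ in range(200):
    theta = pi2_update(theta, lambda t: float(np.sum((t - target) ** 2)), rng)
```

Note that the update needs only rollout costs, no gradients of the cost, which is what makes the approach attractive for robot learning.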
Apprenticeship learning via inverse reinforcement learning
This work models the expert as maximizing a reward function expressible as a linear combination of known features, and gives an algorithm for learning the task demonstrated by the expert, using "inverse reinforcement learning" to recover the unknown reward function.
Learning to search: Functional gradient techniques for imitation learning
The work presented extends the Maximum Margin Planning (MMP) framework to admit learning of more powerful, non-linear cost functions, and demonstrates practical real-world performance with three applied case-studies including legged locomotion, grasp planning, and autonomous outdoor unstructured navigation.
Learning Attractor Landscapes for Learning Motor Primitives
By nonlinearly transforming the canonical attractor dynamics using techniques from nonparametric regression, almost arbitrary new nonlinear policies can be generated without losing the stability properties of the canonical system.
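The attractor-plus-learned-nonlinearity idea can be sketched as a minimal one-dimensional dynamic movement primitive; the gains, basis functions, and integration settings here are illustrative, not the paper's formulation:

```python
import numpy as np

def rollout_dmp(g, y0, weights, centers, widths,
                dt=0.01, alpha=25.0, beta=6.25, alpha_x=8.0):
    """Integrate y'' = alpha*(beta*(g - y) - y') + f(x), a spring-damper
    toward goal g, plus a phase-dependent nonlinear forcing term f built
    from Gaussian basis functions. f is gated by the decaying phase x, so
    the shape is modulated while the attractor's stability is preserved."""
    y, yd, x = float(y0), 0.0, 1.0
    traj = []
    for _ in range(int(1.0 / dt)):                    # integrate for 1 second
        psi = np.exp(-widths * (x - centers) ** 2)    # basis activations
        f = (psi @ weights) / (psi.sum() + 1e-10) * x # forcing term, fades with x
        ydd = alpha * (beta * (g - y) - yd) + f
        yd += ydd * dt
        y += yd * dt
        x += -alpha_x * x * dt                        # canonical phase decay
        traj.append(y)
    return np.array(traj)

# With zero weights the system is a pure spring-damper converging to g
traj = rollout_dmp(g=1.0, y0=0.0, weights=np.zeros(5),
                   centers=np.linspace(0.0, 1.0, 5), widths=np.full(5, 10.0))
```

Learning the weights reshapes the transient while the goal attractor remains stable, which is the property the abstract highlights.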
Reinforcement learning of motor skills with policy gradients
This paper examines learning of complex motor skills with human-like limbs, and combines the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning with the theory of stochastic policy gradient learning.