Corpus ID: 6148715

An Invitation to Imitation

@inproceedings{Bagnell2015AnIT,
  title={An Invitation to Imitation},
  author={J. Andrew Bagnell},
  year={2015}
}
Abstract: Imitation learning is the study of algorithms that attempt to improve performance by mimicking a teacher's decisions and behaviors. Such techniques promise to enable effective programming by demonstration to automate tasks, such as driving, that people can demonstrate but find difficult to hand program. This work represents a summary from a very personal perspective of research on computationally effective methods for learning to imitate behavior. I intend it to serve two audiences… 


An Algorithmic Perspective on Imitation Learning
TLDR
This work provides an introduction to imitation learning, dividing the field into directly replicating desired behavior (behavioral cloning) and learning the hidden objectives of that behavior from demonstrations (called inverse optimal control or inverse reinforcement learning [Russell, 1998]).
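To make the first of those two branches concrete, below is a minimal behavior-cloning sketch in Python; the synthetic data, the stand-in expert, and the scikit-learn classifier are illustrative assumptions, not details from either paper.

```python
# Behavior cloning reduces imitation to supervised learning on
# (state, expert action) pairs. Everything here is a toy stand-in.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
states = rng.normal(size=(1000, 4))              # observed states
expert_actions = (states[:, 0] > 0).astype(int)  # stand-in expert policy

policy = LogisticRegression().fit(states, expert_actions)

# At test time the learned policy is queried like any classifier.
print(policy.predict(states[:5]))
```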
A Bayesian Approach to Generative Adversarial Imitation Learning
TLDR
This work proposes a Bayesian formulation of generative adversarial imitation learning (GAIL), where the imitation policy and the cost function are represented as stochastic neural networks and shows that it can significantly enhance the sample efficiency of GAIL leveraging the predictive density of the cost.
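For context, the standard (non-Bayesian) GAIL objective that this work builds on is a saddle point between the imitation policy and a discriminator, with the expert policy written as pi_E and H denoting causal entropy; the Bayesian variant summarized above additionally treats the cost (discriminator) probabilistically. A sketch in the original GAIL notation:

```latex
\min_{\pi}\,\max_{D}\;
\mathbb{E}_{\pi}\bigl[\log D(s,a)\bigr]
+ \mathbb{E}_{\pi_E}\bigl[\log\bigl(1 - D(s,a)\bigr)\bigr]
- \lambda H(\pi)
```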
Survey of imitation learning for robotic manipulation
TLDR
This survey of imitation learning for robotic manipulation covers three aspects, namely demonstration, representation, and learning algorithms, and highlights areas of future research potential.
Model-Free Imitation Learning with Policy Optimization
TLDR
Under the apprenticeship learning formalism, this work develops alternative model-free algorithms for finding a parameterized stochastic policy that performs at least as well as an expert policy on an unknown cost function, based on sample trajectories from the expert.
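The reason feature-expectation matching suffices under this formalism is worth one line: if the unknown cost is linear in features with a bounded weight vector, then Cauchy-Schwarz bounds the performance gap between learner and expert by the gap in their discounted feature expectations (standard apprenticeship-learning notation, not specific to this paper):

```latex
\bigl| J_c(\pi) - J_c(\pi_E) \bigr|
= \bigl| w^\top \bigl( \mu(\pi) - \mu(\pi_E) \bigr) \bigr|
\le \bigl\lVert \mu(\pi) - \mu(\pi_E) \bigr\rVert_2 ,
\qquad
\mu(\pi) = \mathbb{E}\Bigl[ \textstyle\sum_{t \ge 0} \gamma^t \phi(s_t, a_t) \,\Big|\, \pi \Bigr],
\quad \lVert w \rVert_2 \le 1
```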
Relational Mimic for Visual Adversarial Imitation Learning
TLDR
A new neural network architecture is introduced that improves upon the previous state of the art in reinforcement learning, and it is illustrated how increasing the agent's relational reasoning capabilities enables it to achieve increasingly higher performance in a challenging locomotion task with pixel inputs.
Sequence Model Imitation Learning with Unobserved Contexts
TLDR
It is proved that on-policy imitation learning algorithms (with or without access to a queryable expert) are better equipped to handle these sorts of asymptotically realizable problems than off-policy methods and are able to avoid the latching behavior that plagues the latter.
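The canonical on-policy scheme with a queryable expert is a DAgger-style loop: roll out the current learner, have the expert label the states the learner actually visits, aggregate the data, and retrain. The sketch below uses toy stand-ins for the environment, expert, and learner; only the loop structure is meant to carry over.

```python
# DAgger-style on-policy imitation loop with a queryable expert.
# Environment, expert, and learner are illustrative stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def expert(state):                # stand-in queryable expert
    return int(state[0] > 0)

def rollout(policy, n=200):       # states visited under the current policy
    states = rng.normal(size=(n, 4))
    if policy is not None:        # crude stand-in for on-policy state drift
        states[:, 0] += policy.predict(states) - 0.5
    return states

data_s, data_a, policy = [], [], None
for _ in range(5):                # DAgger iterations
    visited = rollout(policy)                     # on-policy states
    data_s.append(visited)
    data_a.append([expert(s) for s in visited])   # expert labels, on-policy
    policy = LogisticRegression().fit(np.vstack(data_s),
                                      np.concatenate(data_a))
```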
Task-Oriented Deep Reinforcement Learning for Robotic Skill Acquisition and Control
TLDR
An efficient model-free off-policy actor–critic algorithm for robotic skill acquisition and continuous control is presented, fusing the task reward with a task-oriented guiding reward formulated from a small number of imperfect expert demonstrations.
Learning Online from Corrective Feedback: A Meta-Algorithm for Robotics
TLDR
This work unifies prior work into a general corrective-feedback meta-algorithm and shows that the approach can learn quickly from a variety of noisy feedback modalities.
Deep Reinforcement Learning for the Control of Robotic Manipulation: A Focussed Mini-Review
TLDR
This paper presents recent significant progress in deep reinforcement learning algorithms that tackle the problems facing their application to robotic manipulation control, such as sample efficiency and generalization.
Deep Reinforcement Learning for Soft Robotic Applications: Brief Overview with Impending Challenges
TLDR
An overview of various deep reinforcement learning algorithms, along with instances of them being applied to real-world scenarios and yielding state-of-the-art results, is presented, followed by brief descriptions of nascent branches of DRL research that may become centers of future work in this field.

References

Showing 1-10 of 83 references
Inverse Optimal Heuristic Control for Imitation Learning
TLDR
Inverse optimal heuristic control (IOHC) is presented, a novel approach to imitation learning that employs long-horizon IOC-style modeling in a low-dimensional space where inference remains tractable, while incorporating an additional descriptive set of BC-style features to guide a higher-dimensional overall action selection.
Learning to search: Functional gradient techniques for imitation learning
TLDR
The work presented extends the Maximum Margin Planning (MMP) framework to admit learning of more powerful, non-linear cost functions, and demonstrates practical real-world performance with three applied case-studies including legged locomotion, grasp planning, and autonomous outdoor unstructured navigation.
Learning to search: structured prediction techniques for imitation learning
TLDR
This thesis develops learning techniques that leverage the performance of modern robotic components, applying both specifically to training existing state-of-the-art planners and broadly to solving a range of structured prediction problems of importance in learning and robotics.
Robot Learning From Demonstration
TLDR
This work has shown that incorporating a task level direct learning component, which is non-model-based, in addition to the model-based planner, is useful in compensating for structural modeling errors and slow model learning.
Learning from Limited Demonstrations
TLDR
This work proves an upper bound on the Bellman error of the estimate computed by APID at each iteration, and shows empirically that APID outperforms pure Approximate Policy Iteration, a state-of-the-art LfD algorithm, and supervised learning in a variety of scenarios, including when very few and/or suboptimal demonstrations are available.
Stabilizing Human Control Strategies through Reinforcement Learning
TLDR
This paper proposes a new algorithm, rooted in reinforcement learning, for stabilizing learned models of human control strategy (HCS), illustrates how the resulting HCS models can be stabilized, and reports positive experimental results with the proposed algorithm.
Boosting Structured Prediction for Imitation Learning
TLDR
A novel approach, MMPBOOST, based on the functional gradient descent view of boosting, extends MMP by "boosting" in new features, using simple binary classification or regression to improve the performance of MMP imitation learning.
Maximum Entropy Inverse Reinforcement Learning
TLDR
A probabilistic approach based on the principle of maximum entropy that provides a well-defined, globally normalized distribution over decision sequences, while providing the same performance guarantees as existing methods is developed.
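Concretely, for deterministic dynamics the maximum-entropy model weights each path exponentially by its cumulative feature counts, and the log-likelihood gradient is the gap between empirical and expected features (notation following Ziebart et al., 2008):

```latex
P(\zeta \mid \theta) = \frac{\exp\bigl(\theta^\top \mathbf{f}_\zeta\bigr)}{Z(\theta)},
\qquad
\mathbf{f}_\zeta = \sum_{s_t \in \zeta} \mathbf{f}_{s_t},
\qquad
\nabla_\theta \mathcal{L} = \tilde{\mathbf{f}} - \sum_{\zeta} P(\zeta \mid \theta)\,\mathbf{f}_\zeta
```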
Apprenticeship learning via inverse reinforcement learning
TLDR
This work thinks of the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and gives an algorithm for learning the task demonstrated by the expert, based on using "inverse reinforcement learning" to try to recover the unknown reward function.
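With the reward linear in features, a policy's value is linear in its feature expectations, and each iteration of the cited algorithm solves a max-margin problem over the policies found so far (a sketch of the quadratic-programming step; notation adapted from Abbeel and Ng):

```latex
\max_{t,\; w:\ \lVert w \rVert_2 \le 1} \; t
\quad \text{s.t.} \quad
w^\top \mu(\pi_E) \;\ge\; w^\top \mu(\pi_j) + t,
\qquad j = 1, \dots, n
```

The resulting weight vector defines a reward for which an inner reinforcement-learning step computes the next candidate policy.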
Maximum margin planning
TLDR
This work learns mappings from features to costs so that an optimal policy in an MDP with these costs mimics the expert's behavior, and demonstrates a simple, provably efficient approach to structured maximum-margin learning, based on the subgradient method, that leverages existing fast algorithms for inference.
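The MMP objective can be sketched compactly: with example i's feature matrix F_i, expert state-action frequencies mu_i, and margin-scaling loss vector l_i, one minimizes a regularized structured hinge loss whose subgradient needs only a loss-augmented planner (notation adapted from the MMP paper; mu_i* denotes the loss-augmented optimal plan attaining the inner minimum):

```latex
\min_{w}\; \frac{\lambda}{2}\lVert w\rVert^2
+ \frac{1}{N}\sum_{i=1}^{N}\Bigl( w^\top F_i \mu_i
- \min_{\mu \in \mathcal{G}_i}\bigl( w^\top F_i \mu - \ell_i^\top \mu \bigr) \Bigr),
\qquad
g = \lambda w + \frac{1}{N}\sum_{i=1}^{N} F_i \bigl( \mu_i - \mu_i^{*} \bigr)
```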