• Corpus ID: 237370930

Back to Reality for Imitation Learning

@article{Johns2021BackTR,
  title={Back to Reality for Imitation Learning},
  author={Edward Johns},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.12867}
}
Imitation learning, and robot learning in general, emerged due to breakthroughs in machine learning, rather than breakthroughs in robotics. As such, evaluation metrics for robot learning are deeply rooted in those for machine learning. In this paper, we expose a worrying trend that this has led to, with a meta-analysis of imitation learning papers accepted at CoRL 2020. We show that traditional evaluation metrics, which only encourage data efficiency, do not consider how the robot…
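To make the metric critique concrete, here is a minimal, purely illustrative sketch (not a metric from the paper; all numbers and function names are invented) contrasting a demonstrations-only efficiency score with one that also charges for the human time spent resetting the environment during training:

```python
# Hypothetical illustration: two ways to score an imitation-learning method.
# Numbers are invented; this is not the metric proposed in the paper.

def demos_only_score(success_rate: float, num_demos: int) -> float:
    """Traditional view: success achieved per demonstration collected."""
    return success_rate / num_demos

def human_time_score(success_rate: float, num_demos: int,
                     secs_per_demo: float, num_resets: int,
                     secs_per_reset: float) -> float:
    """Broader view: success per second of total human effort,
    including environment resets during training."""
    total_human_secs = num_demos * secs_per_demo + num_resets * secs_per_reset
    return success_rate / total_human_secs

# Method A: very few demos, but thousands of hand-supervised resets.
# Method B: more demos, but runs unattended afterwards.
a_demo = demos_only_score(0.9, num_demos=1)
b_demo = demos_only_score(0.9, num_demos=20)
a_time = human_time_score(0.9, 1, secs_per_demo=60, num_resets=5000, secs_per_reset=10)
b_time = human_time_score(0.9, 20, secs_per_demo=60, num_resets=0, secs_per_reset=10)

print(f"demos-only: A={a_demo:.3f} beats B={b_demo:.3f}")
print(f"human-time: B={b_time:.5f} beats A={a_time:.5f}")
```

Under the demonstrations-only score, method A looks far better; once reset time is counted, the ranking flips, which is the kind of blind spot the meta-analysis highlights.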


References

An Algorithmic Perspective on Imitation Learning
TLDR
This work provides an introduction to imitation learning, dividing it into methods that directly replicate desired behavior and methods that learn the hidden objectives of that behavior from demonstrations (known as inverse optimal control or inverse reinforcement learning [Russell, 1998]).
Visual Imitation Made Easy
TLDR
This work presents an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots, and uses a commercially available reacher-grabber assistive tool both as a data-collection device and as the robot's end-effector.
Positive-Unlabeled Reward Learning
TLDR
This paper shows that applying a large-scale PU learning algorithm to the reward learning problem drastically improves both GAIL and supervised reward learning, without any additional assumptions.
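As a rough sketch of the underlying idea, the non-negative positive-unlabeled (nnPU) risk estimator of Kiryo et al. (2017) can be applied to discriminator scores, treating expert demonstrations as positives and agent rollouts as unlabeled; the class prior pi_p, the sigmoid loss, and the pairing with a GAIL-style discriminator below are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def sigmoid_loss(scores, labels):
    """Smooth surrogate loss l(z, y) = sigmoid(-y * z)."""
    return 1.0 / (1.0 + np.exp(labels * scores))

def nnpu_risk(pos_scores, unl_scores, pi_p=0.5):
    """Non-negative PU risk (Kiryo et al., 2017):
    positives = expert transitions, unlabeled = agent transitions."""
    r_pos = pi_p * sigmoid_loss(pos_scores, +1).mean()
    # Negative-class risk estimated from unlabeled data, corrected by the
    # positive prior and clipped at zero to reduce overfitting.
    r_neg = sigmoid_loss(unl_scores, -1).mean() - pi_p * sigmoid_loss(pos_scores, -1).mean()
    return r_pos + max(r_neg, 0.0)

rng = np.random.default_rng(0)
expert = rng.normal(1.0, 1.0, size=256)   # discriminator scores on demos
agent = rng.normal(-0.5, 1.0, size=256)   # scores on agent rollouts
print("nnPU risk:", nnpu_risk(expert, agent, pi_p=0.3))
```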
Learning Object Manipulation Skills via Approximate State Estimation from Real Videos
TLDR
An optimization-based method is developed to estimate a coarse 3D state representation for the hand and the manipulated object(s) without requiring any supervision, and these trajectories are used as dense rewards for an agent that learns to mimic them through reinforcement learning.
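A common recipe for turning such estimated trajectories into dense rewards, sketched below under the assumption of an exponentiated state-distance reward (the paper's exact reward may differ), is:

```python
import numpy as np

def trajectory_matching_reward(state, ref_state, scale=1.0):
    """Dense reward that peaks at 1 when the agent's coarse 3D state
    matches the reference trajectory extracted from video."""
    return np.exp(-scale * np.linalg.norm(state - ref_state))

# Hypothetical 3D positions: reference from video vs. current agent state.
ref = np.array([0.10, 0.20, 0.05])
cur = np.array([0.12, 0.18, 0.05])
print(trajectory_matching_reward(cur, ref, scale=10.0))  # close -> near 1
```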
Deep Reinforcement Learning for Industrial Insertion Tasks with Visual Inputs and Natural Rewards
TLDR
This paper considers a variety of difficult industrial insertion tasks with visual inputs and different natural reward specifications, namely sparse rewards and goal images, and shows that methods that combine RL with prior information can solve these tasks from a reasonable amount of real-world interaction.
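For intuition, a minimal sketch of the two "natural" reward specifications mentioned, sparse task completion and goal-image distance, is given below; the embedding inputs are hypothetical placeholders:

```python
import numpy as np

def sparse_reward(inserted: bool) -> float:
    """Binary completion signal: reward only on successful insertion."""
    return 1.0 if inserted else 0.0

def goal_image_reward(obs_embedding, goal_embedding):
    """Dense 'goal image' reward: negative distance between (hypothetical)
    embeddings of the current observation and the goal image."""
    return -float(np.linalg.norm(obs_embedding - goal_embedding))

print(sparse_reward(False), sparse_reward(True))
print(goal_image_reward(np.array([0.2, 0.4]), np.array([0.0, 0.5])))
```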
Coarse-to-Fine Imitation Learning: Robot Manipulation from a Single Demonstration
  • Edward Johns
  • Computer Science
    2021 IEEE International Conference on Robotics and Automation (ICRA)
  • 2021
We introduce a simple new method for visual imitation learning, which allows a novel robot manipulation task to be learned from a single human demonstration, without requiring any prior knowledge of…
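A hedged sketch of the coarse-to-fine idea, assuming (as the title suggests) a coarse stage that reaches the demonstration's starting pose and a fine stage that replays the demonstrated relative motion; the two robot callbacks are hypothetical placeholders:

```python
import numpy as np

def replay_from_bottleneck(reach_pose_fn, apply_delta_fn, demo_poses):
    """Two-stage control sketch: servo to the demo's starting
    ('bottleneck') pose, then replay the demo's relative motion open-loop."""
    reach_pose_fn(demo_poses[0])            # coarse stage: reach start pose
    for prev, nxt in zip(demo_poses[:-1], demo_poses[1:]):
        apply_delta_fn(nxt - prev)          # fine stage: replay demo deltas

# Dummy robot callbacks so the sketch runs; a real robot would move here.
poses = np.linspace([0.0, 0.0, 0.2], [0.0, 0.0, 0.0], num=5)
replay_from_bottleneck(lambda p: print("servo to", p),
                       lambda d: print("step by", d),
                       poses)
```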
Learning Multi-Stage Tasks with One Demonstration via Self-Replay
In this work, we introduce a novel method to learn everyday-like multi-stage tasks from a single human demonstration, without requiring any prior object knowledge. Inspired by the recent…
Transformers for One-Shot Visual Imitation
TLDR
This paper investigates techniques that allow robots to partially bridge domain gaps using their past experience, and hypothesizes that policy representations must be both context-driven and dynamics-aware in order to perform these tasks.
Learning from Demonstrations using Signal Temporal Logic
TLDR
Signal Temporal Logic is used to evaluate and rank the quality of demonstrations, and it is shown that this approach outperforms the state-of-the-art Maximum Causal Entropy Inverse Reinforcement Learning.
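As a hedged illustration of ranking demonstrations by STL robustness, the sketch below uses the standard robustness semantics for a simple "eventually reach the goal" specification F(||x_t - g|| < eps); the specification and thresholds are invented for illustration, not taken from the paper:

```python
import numpy as np

def robustness_eventually_reach(traj, goal, eps):
    """Robustness of F (||x_t - g|| < eps): the best margin over time.
    Positive means the spec is satisfied; larger is 'more satisfied'."""
    margins = eps - np.linalg.norm(traj - goal, axis=1)
    return margins.max()

rng = np.random.default_rng(1)
goal = np.array([1.0, 1.0])
# Three hypothetical 2D demonstrations of decreasing quality.
demos = [rng.normal(goal, s, size=(50, 2)) for s in (0.05, 0.2, 0.8)]
ranked = sorted(demos, reverse=True,
                key=lambda d: robustness_eventually_reach(d, goal, eps=0.3))
for d in ranked:
    print(robustness_eventually_reach(d, goal, eps=0.3))
```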
Tolerance-Guided Policy Learning for Adaptable and Transferrable Delicate Industrial Insertion
TLDR
The results show that RS-GAIL can efficiently learn optimal policies under sparse rewards; the tolerance embedding can enhance the transferability of the learned policy; and the probabilistic inference makes the policy robust to defects on the workpieces.