Corpus ID: 229158088

Human-in-the-Loop Imitation Learning using Remote Teleoperation

@article{Mandlekar2020HumanintheLoopIL,
  title={Human-in-the-Loop Imitation Learning using Remote Teleoperation},
  author={Ajay Mandlekar and Danfei Xu and Roberto Mart{\'i}n-Mart{\'i}n and Yuke Zhu and Li Fei-Fei and Silvio Savarese},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.06733}
}
Imitation Learning is a promising paradigm for learning complex robot manipulation skills by reproducing behavior from human demonstrations. However, manipulation tasks often contain bottleneck regions that require a sequence of precise actions to make meaningful progress, such as a robot inserting a pod into a coffee machine to make coffee. Trained policies can fail in these regions because small deviations in actions can lead the policy into states not covered by the demonstrations… 
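The human-in-the-loop setup described in the abstract can be pictured as a gated data-collection loop: the robot acts autonomously, the remote operator takes over near bottleneck regions, and each sample records whether it came from an intervention so training can up-weight those transitions. A minimal illustrative sketch, not the paper's actual interface — the names `needs_help`, `env_step`, and the toy 1-D dynamics in the example are assumptions:

```python
def collect_episode(policy, human_policy, needs_help, env_step, state, horizon=20):
    """One episode of human-gated data collection (illustrative sketch).

    The robot's `policy` proposes actions; when `needs_help(state, action)`
    fires, the remote human takes over. Each sample is flagged with whether
    it came from an intervention, so training can up-weight those
    transitions (as in intervention-weighted schemes).
    """
    data = []
    for _ in range(horizon):
        proposed = policy(state)
        if needs_help(state, proposed):          # remote human monitoring gate
            action, intervened = human_policy(state), True
        else:
            action, intervened = proposed, False
        data.append((state, action, intervened))
        state = env_step(state, action)
    return data
```

In a toy 1-D chain where the robot drifts one way and the human pushes back, the intervention flags mark exactly the states the human corrected, which is the signal the training loop would exploit.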

Figures and Tables from this paper

What Matters in Learning from Offline Human Demonstrations for Robot Manipulation
TLDR
An extensive study of offline learning algorithms for robot manipulation on five simulated and three real-world multi-stage manipulation tasks of varying complexity, with datasets of varying quality; the study analyzes the most critical challenges of learning from offline human data and highlights opportunities, such as the ability to learn proficient policies on challenging, multi-stage tasks beyond the scope of current reinforcement learning methods.
Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization
TLDR
A novel human-in-the-loop learning method called Human-AI Copilot Optimization (HACO), which extracts proxy state-action values from partial human demonstration and optimizes the agent to improve the proxy values while reducing the human interventions.
Correct Me If I am Wrong: Interactive Learning for Robotic Manipulation
TLDR
The proposed CEILing (Corrective and Evaluative Interactive Learning) framework combines both corrective and evaluative feedback from the teacher to train a stochastic policy in an asynchronous manner, and employs a dedicated mechanism to trade off human corrections with the robot’s own experience.
LazyDAgger: Reducing Context Switching in Interactive Imitation Learning
TLDR
LazyDAgger is presented, which extends the interactive imitation learning (IL) algorithm SafeDAgger to reduce context switches between supervisor and autonomous control and improves the performance and robustness of the learned policy during both learning and execution while limiting burden on the supervisor.
ReIL: A Framework for Reinforced Intervention-based Imitation Learning
TLDR
Experimental results from real-world mobile robot navigation challenges indicate that ReIL learns rapidly from sparse supervisor corrections without the performance deterioration characteristic of supervised learning-based methods such as HG-DAgger and IWR.
Learning by Watching: Physical Imitation of Manipulation Skills from Human Videos
TLDR
This paper presents Learning by Watching (LbW), an algorithmic framework for policy learning through imitation from a single video specifying the task; LbW learns an unsupervised human-to-robot translation to overcome the morphology mismatch issue.
Learning Embodied Agents with Scalably-Supervised Reinforcement Learning
TLDR
This thesis considers alternative modalities of supervision that can be more scalable and easier to provide from the human user and shows that such supervision can drastically improve the agent’s learning efficiency, enabling the agent to do directed exploration and learning within a large search space of states.
ThriftyDAgger: Budget-Aware Novelty and Risk Gating for Interactive Imitation Learning
TLDR
ThriftyDAgger, an algorithm for actively querying a human supervisor given a desired budget of human interventions, is presented and a novel metric for estimating risk under the current robot policy is introduced.
A Framework for Composite Layup Skill Learning and Generalizing Through Teleoperation
In this article, an impedance control-based framework for human-robot composite layup skill transfer was developed, and the human-in-the-loop mechanism was investigated to achieve human-robot skill…

References

SHOWING 1-10 OF 46 REFERENCES
Learning from Interventions: Human-robot interaction as both explicit and implicit feedback
TLDR
This work argues that learning interactively from expert interventions enjoys the best of both worlds, and formalizes this as a constraint on the learner's value function, which it can efficiently learn using no-regret online learning techniques.
DART: Noise Injection for Robust Imitation Learning
TLDR
A new algorithm is proposed, DART (Disturbances for Augmenting Robot Trajectories), that collects demonstrations with injected noise, and optimizes the noise level to approximate the error of the robot's trained policy during data collection.
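The core idea behind DART's collection scheme can be sketched in a few lines: execute the supervisor's actions with injected noise so the recorded states resemble the small deviations a trained policy will later make, while keeping the noise-free actions as labels. A hedged sketch under assumed names and 1-D dynamics, not DART's actual implementation (DART also optimizes the noise level against the learner's error, an outer loop omitted here):

```python
import random

def collect_noisy_demo(supervisor, env_step, state, sigma, horizon=20):
    """Record (state, label) pairs while executing noise-perturbed actions.

    Labels stay the supervisor's noise-free actions; only the *executed*
    action is perturbed, so the visited states cover the kinds of small
    deviations a trained policy would make at test time.
    """
    data = []
    for _ in range(horizon):
        label = supervisor(state)                     # clean supervision signal
        executed = label + random.gauss(0.0, sigma)   # injected disturbance
        data.append((state, label))
        state = env_step(state, executed)
    return data
```

Note that every recorded label is still the supervisor's action at the recorded state, so the dataset remains fit for ordinary behavior cloning; only the state distribution changes.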
Comparing human-centric and robot-centric sampling for robot deep learning from demonstrations
TLDR
It is observed in simulation that for linear SVMs, policies learned with robot-centric (RC) sampling outperformed those learned with human-centric (HC) sampling, but with deep models this advantage disappears; it is also proved there exists a class of examples in which, in the limit, HC is guaranteed to converge to an optimal policy while RC may fail to converge.
Interactive Policy Learning through Confidence-Based Autonomy
TLDR
The algorithm selects demonstrations based on a measure of action selection confidence, and results show that using Confident Execution the agent requires fewer demonstrations to learn the policy than when demonstrations are selected by a human teacher.
HG-DAgger: Interactive Imitation Learning with Human Experts
TLDR
HG-DAgger is proposed, a variant of DAgger that is more suitable for interactive imitation learning from human experts in real-world systems and learns a safety threshold for a model-uncertainty-based risk metric that can be used to predict the performance of the fully trained novice in different regions of the state space.
ROBOTURK: A Crowdsourcing Platform for Robotic Skill Learning through Imitation
TLDR
It is shown that the data obtained through RoboTurk enables policy learning on multi-step manipulation tasks with sparse rewards and that using larger quantities of demonstrations during policy learning provides benefits in terms of both learning consistency and final performance.
End-to-End Robotic Reinforcement Learning without Reward Engineering
TLDR
This paper proposes an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task.
Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation
TLDR
It is described how consumer-grade Virtual Reality headsets and hand tracking hardware can be used to naturally teleoperate robots to perform complex tasks and how imitation learning can learn deep neural network policies that can acquire the demonstrated skills.
Helping Robots Learn: A Human-Robot Master-Apprentice Model Using Demonstrations via Virtual Reality Teleoperation
TLDR
The master-apprentice model augments self-supervised learning with learning by demonstration, efficiently using the human’s time and expertise while facilitating future scalability to supervision of multiple robots.
Learning from Physical Human Corrections, One Feature at a Time
TLDR
The approach allows the human-robot team to focus on learning one feature at a time, unlike state-of-the-art techniques that update all features at once, and suggests that users teaching one-at-a-time perform better, especially in tasks that require changing multiple features.