Learning and Retrieval from Prior Data for Skill-based Imitation Learning

Soroush Nasiriany, Tian Gao, Ajay Mandlekar, Yuke Zhu

Abstract: Imitation learning offers a promising path for robots to learn general-purpose tasks, but traditionally has enjoyed limited scalability due to high data supervision requirements and brittle generalization. Inspired by recent work on skill-based imitation learning, we investigate whether leveraging prior data from previous related tasks can enable learning novel tasks in a more robust, data-efficient manner. To make effective use of the prior data, the agent must internalize…

Figures and Tables from this paper



Accelerating Reinforcement Learning with Learned Skill Priors

This work proposes a deep latent variable model that jointly learns an embedding space of skills and the skill prior from offline agent experience, and extends common maximum-entropy RL approaches to use skill priors to guide downstream learning.
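The core idea of guiding downstream RL with a learned skill prior can be sketched as replacing the max-entropy bonus with a penalty on divergence from the prior. The following is a minimal NumPy sketch under assumed diagonal-Gaussian skill distributions; the function names (`gaussian_kl`, `skill_prior_objective`) and the fixed coefficient `alpha` are illustrative, not taken from the paper.

```python
import numpy as np

def gaussian_kl(mu_q, log_std_q, mu_p, log_std_p):
    """KL(q || p) between diagonal Gaussians, summed over skill dimensions."""
    var_q = np.exp(2 * log_std_q)
    var_p = np.exp(2 * log_std_p)
    return np.sum(
        log_std_p - log_std_q + (var_q + (mu_q - mu_p) ** 2) / (2 * var_p) - 0.5
    )

def skill_prior_objective(q_value, mu_pi, log_std_pi, mu_prior, log_std_prior,
                          alpha=0.1):
    """SAC-style policy objective where the usual entropy bonus is replaced by
    a penalty on divergence from the learned skill prior:
        Q(s, z) - alpha * KL(pi(z|s) || prior(z|s))."""
    return q_value - alpha * gaussian_kl(mu_pi, log_std_pi, mu_prior, log_std_prior)
```

When the policy matches the prior the KL term vanishes and the objective reduces to the plain Q-value, so the prior only shapes exploration where the two distributions disagree.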

Hierarchical Few-Shot Imitation with Skill Transition Models

FIST is capable of generalizing to new tasks and substantially outperforms prior baselines in navigation experiments requiring traversing unseen parts of a large maze and 7-DoF robotic arm experiments requiring manipulating previously unseen objects in a kitchen.

Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations

This work presents Generalization Through Imitation (GTI), a two-stage offline imitation learning algorithm that exploits this intersecting structure to train goal-directed policies that generalize to unseen start and goal state combinations.

Demonstration-Guided Reinforcement Learning with Learned Skills

Skill-based Learning with Demonstrations (SkiLD) is proposed, an algorithm for demonstration-guided RL that efficiently leverages the provided demonstrations by following the demonstrated skills instead of the primitive actions, resulting in substantial performance improvements over prior demonstration-guided RL approaches.

COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning

It is shown that even when the prior data does not actually succeed at solving the new task, it can still be utilized for learning a better policy, by providing the agent with a broader understanding of the mechanics of its environment.

BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning

An interactive and flexible imitation learning system that can learn from both demonstrations and interventions and can be conditioned on different forms of information that convey the task, including pretrained embeddings of natural language or videos of humans performing the task.

TRAIL: Near-Optimal Imitation Learning with Suboptimal Data

The theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning, effectively reducing the need for large near-optimal expert datasets through the use of auxiliary non-expert data.
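The latent-action factorization described above can be sketched as composing a latent policy (trained on expert data) with an action decoder (trained on cheap non-expert data). The shapes, weight matrices, and function name below are purely illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions: 4-dim state, 2-dim latent action, 3-dim raw action.
W_high = rng.normal(size=(2, 4))   # latent policy z = f(s), fit on expert data
W_dec = rng.normal(size=(3, 6))    # decoder a = g(s, z), fit on non-expert data

def act(state):
    """Compose the latent policy with the action decoder to get a raw action."""
    z = W_high @ state                      # choose a latent action
    return W_dec @ np.concatenate([state, z])  # decode it to a primitive action
```

The expert data only needs to pin down the low-dimensional map `f`, which is the source of the claimed sample-efficiency gain.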

Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning

This work simplifies the long-horizon policy learning problem by using a novel data-relabeling algorithm for learning goal-conditioned hierarchical policies, in which the low-level policy acts for only a fixed number of steps, regardless of the goal achieved.
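The fixed-horizon relabeling idea can be sketched as slicing a demonstration into windows and labeling each step with the state actually reached a fixed number of steps later. This is a minimal sketch of that general scheme; the function name and tuple layout are assumptions for illustration, not the paper's exact interface.

```python
def relabel_fixed_window(states, actions, window=5):
    """Turn one demonstration into goal-conditioned low-level training tuples
    (state, relabeled_goal, action). Each step's goal is the state reached
    `window` steps later, regardless of the task's original goal."""
    segments = []
    for t in range(len(actions)):
        goal_idx = min(t + window, len(states) - 1)  # clip at trajectory end
        segments.append((states[t], states[goal_idx], actions[t]))
    return segments
```

Because goals come from the data itself, every demonstration yields many valid (state, goal, action) examples without any extra supervision.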

A Framework for Efficient Robotic Manipulation

It is shown that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels, such as reaching, picking, moving, pulling a large object, flipping a switch, and opening a drawer in just 15-50 minutes of real-world training time.

What Matters in Learning from Offline Human Demonstrations for Robot Manipulation

This study analyzes the most critical challenges when learning from offline human data for manipulation and highlights opportunities for learning from human datasets, such as the ability to learn proficient policies on challenging, multi-stage tasks beyond the scope of current reinforcement learning methods.