From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data
@article{Cui2022FromPT,
  title={From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data},
  author={Zichen Jeff Cui and Yibin Wang and Nur Muhammad (Mahi) Shafiullah and Lerrel Pinto},
  journal={ArXiv},
  year={2022},
  volume={abs/2210.10047}
}
While large-scale sequence modeling from offline data has led to impressive performance gains in natural language and image generation, directly translating such ideas to robotics has been challenging. One critical reason for this is that uncurated robot demonstration data, i.e. play data, collected from non-expert human demonstrators are often noisy, diverse, and distributionally multi-modal. This makes extracting useful, task-centric behaviors from such data a difficult generative modeling…
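The abstract's point about distributionally multi-modal play data can be illustrated with a toy sketch (hypothetical data, not the paper's method): when half the demonstrations turn left and half turn right, a uni-modal regression policy trained with MSE predicts the mean, an action no demonstrator ever took, while a categorical policy over discretized actions keeps both modes.

```python
# Toy illustration of mode collapse on multi-modal demonstration data.
import numpy as np

rng = np.random.default_rng(0)
actions = np.concatenate([
    rng.normal(-1.0, 0.05, 500),   # mode 1: turn left
    rng.normal(+1.0, 0.05, 500),   # mode 2: turn right
])

# A uni-modal MSE-trained policy is optimal at the mean of the data,
# which lies between the two modes and matches neither.
mse_policy = actions.mean()
print(f"MSE-optimal action: {mse_policy:+.2f}")  # ~0.00, off both modes

# A categorical policy over discretized action bins keeps both modes.
bins = np.histogram(actions, bins=[-2, 0, 2])[0] / len(actions)
print(f"P(left), P(right): {bins[0]:.2f}, {bins[1]:.2f}")  # 0.50 each
```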
Citations
Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models
- Computer Science
- 2022
DIAL is introduced, which uses semi-supervised language labels, leveraging the semantic understanding of CLIP, to propagate knowledge onto large datasets of unlabelled demonstration data and then trains language-conditioned policies on the augmented datasets, enabling cheaper acquisition of useful language descriptions compared to expensive human labels.
Instruction-Following Agents with Jointly Pre-Trained Vision-Language Models
- Computer Science, ArXiv
- 2022
This work proposes a simple yet effective model for robots to solve instruction-following tasks in vision-based environments that outperforms all state-of-the-art pre-trained or trained-from-scratch methods in both single-task and multi-task settings.
References
Showing 1-10 of 52 references
Behavior Transformers: Cloning k modes with one stone
- Computer Science, ArXiv
- 2022
Behavior Transformer is presented, a new technique that models unlabeled demonstration data with multiple modes and improves over prior state-of-the-art work on solving demonstrated tasks while capturing the major modes present in the pre-collected datasets.
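A minimal sketch of the multi-mode action encoding used by Behavior Transformers: cluster the continuous demonstration actions with k-means, then factor each action into a discrete bin index plus a continuous residual offset. The transformer that predicts these factors is omitted; the data and the small k-means helper below are illustrative, not the paper's implementation.

```python
# Sketch: k-means action discretization with residual offsets.
import numpy as np

def kmeans(x, k, iters=50, seed=0):
    """Tiny k-means for illustration; real code would use a library."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(
            np.linalg.norm(x[:, None] - centers[None], axis=-1), axis=1)
        centers = np.stack([
            x[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
    return centers

# Toy 2-D actions drawn from three behavioral modes.
rng = np.random.default_rng(1)
actions = np.concatenate(
    [rng.normal(m, 0.05, (200, 2)) for m in (-1.0, 0.0, 1.0)])
centers = kmeans(actions, k=3)

# Encode each action as (bin, residual); decode as center[bin] + residual.
bins = np.argmin(
    np.linalg.norm(actions[:, None] - centers[None], axis=-1), axis=1)
residuals = actions - centers[bins]
recon = centers[bins] + residuals
assert np.allclose(recon, actions)   # the factorization is lossless
```

The categorical head can then capture which mode to act in, while the residual head refines the action within that mode.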
Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations
- Computer Science, Robotics: Science and Systems
- 2020
This work presents Generalization Through Imitation (GTI), a two-stage offline imitation learning algorithm that exploits this intersecting structure to train goal-directed policies that generalize to unseen start and goal state combinations.
Learning Latent Plans from Play
- Computer Science, CoRL
- 2019
Play-LMP is introduced, a method designed to handle variability in the learning-from-play (LfP) setting by organizing it in an embedding space; play-supervised models, unlike their expert-trained counterparts, are found to be more robust to perturbations and to exhibit retrying-till-success behavior.
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning
- Computer Science, ICLR
- 2021
This paper proposes a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials from a wide range of previously seen tasks, and shows how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
Demonstration-Bootstrapped Autonomous Practicing via Multi-Task Reinforcement Learning
- Computer Science, ArXiv
- 2022
This work proposes a system that leverages multi-task reinforcement learning bootstrapped with prior data to enable continuous autonomous practicing, minimizing the number of resets needed while learning temporally extended behaviors.
Playful Interactions for Representation Learning
- Computer Science, Psychology, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2022
This work proposes to use playful interactions in a self-supervised manner to learn visual representations for downstream tasks and demonstrates that these representations generalize better than standard behavior cloning and can achieve similar performance with only half the number of required demonstrations.
Towards More Generalizable One-shot Visual Imitation Learning
- Computer Science, 2022 International Conference on Robotics and Automation (ICRA)
- 2022
MOSAIC (Multi-task One-Shot Imitation with self-Attention and Contrastive learning) is proposed; it integrates a self-attention model architecture and a temporal contrastive module to enable better task disambiguation and more robust representation learning.
Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-to-End Learning from Demonstration
- Computer Science, 2018 IEEE International Conference on Robotics and Automation (ICRA)
- 2018
It is demonstrated that it is possible to learn complex manipulation tasks, such as picking up a towel, wiping an object, and depositing the towel to its previous position, entirely from raw images with direct behavior cloning.
Reinforcement Learning as One Big Sequence Modeling Problem
- Computer Science, NeurIPS
- 2021
This work explores how RL can be reframed as “one big sequence modeling” problem, using state-of-the-art Transformer architectures to model distributions over sequences of states, actions, and rewards.
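The "one big sequence modeling" framing can be sketched in a few lines: a trajectory of (state, action, reward) triples is flattened into a single token stream that a standard autoregressive model (omitted here) can be trained on. The trajectory values below are toy placeholders.

```python
# Sketch: flattening an RL trajectory into one token sequence.
trajectory = [
    {"state": 3, "action": 1, "reward": 0.0},
    {"state": 4, "action": 0, "reward": 1.0},
]

def flatten(traj):
    """Interleave s_t, a_t, r_t into a single sequence, in the style of
    trajectory-level sequence models."""
    tokens = []
    for step in traj:
        tokens += [("s", step["state"]),
                   ("a", step["action"]),
                   ("r", step["reward"])]
    return tokens

tokens = flatten(trajectory)
print(tokens[:3])  # [('s', 3), ('a', 1), ('r', 0.0)]
```

Once trajectories are tokens, next-token prediction with a Transformer plays the role of both dynamics model and policy.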
R3M: A Universal Visual Representation for Robot Manipulation
- Computer Science, ArXiv
- 2022
This work pre-trains a visual representation on the Ego4D human video dataset using a combination of time-contrastive learning, video-language alignment, and an L1 penalty to encourage sparse and compact representations, resulting in R3M.
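Two of the R3M training signals named above can be sketched on toy data: an InfoNCE-style time-contrastive term that pulls embeddings of temporally nearby frames together, and an L1 penalty that encourages sparse features. R3M actually trains a ResNet on Ego4D video; the random vectors and the 1e-3 penalty weight below are illustrative assumptions.

```python
# Sketch: time-contrastive loss plus L1 sparsity penalty on toy embeddings.
import numpy as np

rng = np.random.default_rng(0)
z_anchor = rng.normal(size=16)                      # embedding of frame t
z_pos = z_anchor + rng.normal(scale=0.1, size=16)   # temporally nearby frame
z_negs = rng.normal(size=(8, 16))                   # frames from other clips

def info_nce(anchor, pos, negs, temp=0.1):
    """InfoNCE: classify the positive among positive + negatives by
    similarity to the anchor."""
    sims = np.array([anchor @ pos] + [anchor @ n for n in negs]) / temp
    sims -= sims.max()                  # numerical stability
    return -np.log(np.exp(sims[0]) / np.exp(sims).sum())

loss = info_nce(z_anchor, z_pos, z_negs) + 1e-3 * np.abs(z_anchor).sum()
print(f"total loss: {loss:.3f}")
```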