Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
TLDR
Neural Task Programming (NTP) is a novel robot learning framework that bridges few-shot learning from demonstration and neural program induction, and achieves strong generalization across sequential tasks that exhibit hierarchical and compositional structures.
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
TLDR
This work uses self-supervision to learn a compact, multimodal representation of sensory inputs, which can then be used to improve the sample efficiency of policy learning in deep reinforcement learning algorithms.
TSC-DL: Unsupervised trajectory segmentation of multi-modal surgical demonstrations with Deep Learning
TLDR
Transition State Clustering with Deep Learning (TSC-DL) is a new unsupervised algorithm that leverages video and kinematic data for task-level segmentation. It finds regions of the visual feature space that correlate with transition events, using features constructed from layers of pre-trained image classification Deep Convolutional Neural Networks (CNNs).
Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos
TLDR
This work introduces the visually grounded action graph, a structured representation capturing the latent dependency between grounding and references in video, and proposes a new reference-aware multiple instance learning objective for weak supervision of grounding in videos.
Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter
TLDR
This work studies a version of mechanical search in which distractor objects are heaped over the target object in a bin. The results indicate that success can be achieved in this long-horizon task with algorithmic policies in over 95% of instances, and that the number of actions required scales approximately linearly with the size of the heap.
DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image
TLDR
The Free-Form Deformation (FFD) layer is a powerful new building block for Deep Learning models that manipulate 3D data. DeformNet combines this FFD layer with shape retrieval for smooth and detail-preserving 3D reconstruction of qualitatively plausible point clouds with respect to a single query image.
Adversarially Robust Policy Learning: Active construction of physically-plausible perturbations
TLDR
This work introduces Adversarially Robust Policy Learning (ARPL), an algorithm that leverages active computation of physically-plausible adversarial examples during training to enable robust policy learning in the source domain and robust performance under both random and adversarial input perturbations.
Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration
TLDR
Neural Task Graph Networks are proposed, which use a conjugate task graph as the intermediate representation to modularize both the video demonstration and the derived policy, and can effectively predict task structure on the JIGSAWS surgical dataset and generalize to unseen tasks.
Transition State Clustering: Unsupervised Surgical Trajectory Segmentation for Robot Learning
TLDR
Transition State Clustering (TSC) models demonstrations as noisy realizations of a switched linear dynamical system and learns spatially and temporally consistent transition events across demonstrations. It uses a hierarchical Dirichlet Process Gaussian Mixture Model to avoid having to select the number of segments a priori.
ROBOTURK: A Crowdsourcing Platform for Robotic Skill Learning through Imitation
TLDR
It is shown that the data obtained through RoboTurk enables policy learning on multi-step manipulation tasks with sparse rewards and that using larger quantities of demonstrations during policy learning provides benefits in terms of both learning consistency and final performance.