Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
TLDR
A self-adaptive visual navigation method (SAVN) that learns to adapt to new environments without any explicit supervision, showing major improvements in both success rate and SPL for visual navigation in novel scenes.
SeGAN: Segmenting and Generating the Invisible
TLDR
This paper studies the challenging problem of completing the appearance of occluded objects and proposes a novel solution, SeGAN, which outperforms state-of-the-art segmentation baselines for the invisible parts of objects.
Watching the World Go By: Representation Learning from Unlabeled Videos
TLDR
Video Noise Contrastive Estimation is proposed, a method for using unlabeled video to learn strong, transferable single-image representations that demonstrate improvements over recent unsupervised single-image techniques, as well as over fully supervised ImageNet pretraining, across a variety of temporal and non-temporal tasks.
Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
TLDR
This paper addresses the problem of inferring contact points and physical forces from videos of humans interacting with objects by using a physics simulator to predict effects, enforcing that the estimated forces must lead to the same effect as depicted in the video.
Contrasting Contrastive Self-Supervised Representation Learning Models
TLDR
This paper analyzes contrastive approaches as one of the most successful and popular variants of self-supervised representation learning and examines over 700 training experiments including 30 encoders, 4 pre-training datasets, and 20 diverse downstream tasks.
ManipulaTHOR: A Framework for Visual Object Manipulation
TLDR
This work proposes a framework for object manipulation built upon the physics-enabled, visually rich AI2-THOR framework and presents a new challenge to the Embodied AI community known as ArmPointNav, which extends the popular point navigation task to object manipulation and offers new challenges including 3D obstacle avoidance.
Who Let the Dogs Out? Modeling Dog Behavior from Visual Data
TLDR
This model takes visual information as input and directly predicts the actions of the agent; the representation it learns encodes information distinct from representations trained on image classification, and this learned representation can generalize to other domains.
Learning Generalizable Visual Representations via Interactive Gameplay
TLDR
This work is the first to show that embodied adversarial reinforcement learning agents playing cache, a variant of hide-and-seek, in a high-fidelity, interactive environment, learn representations of their observations encoding information such as occlusion, object permanence, free space, and containment.
Act the Part: Learning Interaction Strategies for Articulated Object Part Discovery
TLDR
Act the Part is introduced to learn how to interact with articulated objects to discover and segment their pieces by coupling action selection and motion segmentation, and is able to isolate structures to make perceptual part recovery possible without semantic labels.