• Publications
  • Influence
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering. Cognition is core to tasks that …
  • 1,442
  • 245
Scene Graph Generation by Iterative Message Passing
Understanding a visual scene goes beyond recognizing individual objects in isolation. Relationships between objects also constitute rich semantic information about the scene. In this work, we …
  • 337
  • 78
Target-driven visual navigation in indoor scenes using deep reinforcement learning
Two less addressed issues of deep reinforcement learning are (1) lack of generalization capability to new goals, and (2) data inefficiency, i.e., the model requires several (and often costly) …
  • 656
  • 66
Visual7W: Grounded Question Answering in Images
We have seen great progress in basic perceptual tasks such as object recognition and detection. However, AI models still fail to match humans in high-level vision tasks due to the lack of capacities …
  • 419
  • 64
DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion
A key technical challenge in performing 6D object pose estimation from RGB-D image is to fully leverage the two complementary data sources. Prior works either extract information from the RGB image …
  • 144
  • 40
AI2-THOR: An Interactive 3D Environment for Visual AI
We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at this http URL. AI2-THOR consists of near photo-realistic 3D indoor scenes, where AI agents can navigate …
  • 189
  • 29
Reasoning about Object Affordances in a Knowledge Base Representation
Reasoning about objects and their affordances is a fundamental problem for visual intelligence. Most of the previous work casts this problem as a classification task where separate classifiers are …
  • 168
  • 12
SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark
Reproducibility has been a significant challenge in deep reinforcement learning and robotics research. Open-source frameworks and standardized benchmarks can serve an integral role in rigorous …
  • 54
  • 10
Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
In this work, we propose a novel robot learning framework called Neural Task Programming (NTP), which bridges the idea of few-shot learning from demonstration and neural program induction. NTP takes …
  • 83
  • 9
Reinforcement and Imitation Learning for Diverse Visuomotor Skills
We propose a model-free deep reinforcement learning method that leverages a small amount of demonstration data to assist a reinforcement learning agent. We apply this approach to robotic manipulation …
  • 125
  • 6