• Publications
  • Influence
TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
We study the problem of jointly reasoning about language and vision through a navigation and spatial reasoning task. We introduce the Touchdown task and dataset, where an agent must first follow...
  • 69
  • 17
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
We propose to directly map raw visual observations and text input to actions for instruction execution. While existing approaches assume access to structured environment representations or use a...
  • 103
  • 6
Tell me Dave: Context-sensitive grounding of natural language to manipulation instructions
It is important for a robot to be able to interpret natural language commands given by a human. In this paper, we consider performing a sequence of mobile manipulation tasks with instructions...
  • 102
  • 5
Environment-Driven Lexicon Induction for High-Level Instructions
We focus on the task of interpreting complex natural language instructions to a robot, in which we must ground high-level commands such as "microwave the cup" to low-level actions such as grasping...
  • 43
  • 5
RoboBrain: Large-Scale Knowledge Engine for Robots
In this paper we introduce a knowledge engine, which learns and shares knowledge representations, for robots to carry out a variety of tasks. Building such an engine brings with it the challenge of...
  • 107
  • 4
Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction
We propose to decompose instruction execution to goal prediction and action generation. We design a model that maps raw visual observations to goals using LINGUNET, a language-conditioned image...
  • 42
  • 4
Lipschitz Continuity in Model-based Reinforcement Learning
Model-based reinforcement-learning methods learn transition and reward models and use them to guide behavior. We analyze the impact of learning models that are Lipschitz continuous---the distance...
  • 33
  • 4
CHALET: Cornell House Agent Learning Environment
We present CHALET, a 3D house simulator with support for navigation and manipulation. CHALET includes 58 rooms and 10 house configurations, and makes it easy to create new house and room layouts...
  • 35
  • 2
Equivalence Between Wasserstein and Value-Aware Model-based Reinforcement Learning
Learning a generative model is a key component of model-based reinforcement learning. Though learning a good model in the tabular setting is a simple task, learning a useful model in the approximate...
  • 8
  • 2