Publications
TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
TLDR
This work introduces the Touchdown task and dataset, where an agent must first follow navigation instructions in a Street View environment to reach a goal position, and then find a hidden object by resolving a natural-language description of a location in its observed environment.
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
TLDR
The algorithm provably explores the environment with sample complexity scaling polynomially in the number of latent states and the time horizon, and with no dependence on the size of the observation space, which could be infinitely large; this enables sample-efficient global policy optimization for any reward function.
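In schematic form, the guarantee described in this summary is a sample-complexity bound of the shape below; the exact polynomial and its dependence on the accuracy parameters are the paper's result, so the symbols here (latent-state count |S|, action count |A|, horizon H, accuracy ε, failure probability δ) are only an illustrative reading.

```latex
% Schematic sample-complexity bound: polynomial in the latent-state
% count |S|, action count |A|, and horizon H, with accuracy eps and
% failure probability delta; crucially, no term depends on the
% (possibly infinite) observation space.
n \;=\; \mathrm{poly}\!\left( |S|,\; |A|,\; H,\; \frac{1}{\varepsilon},\; \log\frac{1}{\delta} \right)
```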
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
TLDR
This work learns a single model that jointly reasons about linguistic and visual input, trains a neural network agent in a contextual bandit setting, and shows significant improvements over supervised learning and common reinforcement learning variants.
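As a rough illustration of the contextual-bandit training mentioned above, the sketch below performs a one-step policy-gradient update with an immediate reward and no value bootstrapping; the linear policy, feature dimensions, and reward function are hypothetical placeholders, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: a linear policy over a joint embedding of
# the instruction and the visual observation. Shapes and names are
# illustrative assumptions, not the paper's model.
FEATURE_DIM, NUM_ACTIONS = 16, 4
policy = nn.Linear(FEATURE_DIM, NUM_ACTIONS)
optimizer = torch.optim.SGD(policy.parameters(), lr=0.01)

def bandit_update(features: torch.Tensor, reward_fn) -> float:
    """One contextual-bandit step: sample an action, observe an
    immediate reward, and apply a REINFORCE-style update."""
    dist = torch.distributions.Categorical(logits=policy(features))
    action = dist.sample()                  # stochastic exploration
    reward = reward_fn(action.item())       # immediate scalar feedback
    loss = -dist.log_prob(action) * reward  # no bootstrapping: bandit credit
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward

# Usage with dummy features and a reward that prefers action 2.
features = torch.randn(FEATURE_DIM)
r = bandit_update(features, lambda a: 1.0 if a == 2 else 0.0)
```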
Lipschitz Continuity in Model-based Reinforcement Learning
TLDR
This work analyzes the impact of learning models that are Lipschitz continuous (the distance between function values for two inputs is bounded by a linear function of the distance between the inputs) and proves an error bound for the value-function estimate arising from such models.
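Written out, the Lipschitz condition referenced in this summary says the model f cannot amplify input distances by more than a constant factor K, where d_X and d_Y denote the input and output metrics:

```latex
% f is K-Lipschitz: output distances grow at most linearly in input
% distances, which is what lets per-step model error be propagated
% into a bounded error on the value-function estimate.
d_Y\!\bigl(f(x_1), f(x_2)\bigr) \;\le\; K \, d_X(x_1, x_2)
\qquad \text{for all inputs } x_1, x_2 .
```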
Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations
TLDR
This work biases the search procedure towards semantic parses that are more compatible with the text, which provides better supervision signals for training and leads to a new state-of-the-art model that outperforms previous work by 5.0% absolute on exact-match accuracy.
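One schematic way to read the policy shaping described here is as a weighted update over candidate parses, where the weight folds denotation feedback together with a text-compatibility score; the weighting function w below is an illustrative abstraction, not the paper's exact update equation.

```latex
% Schematic generalized update: candidate parses y for input x are
% reweighted by w(x, y), which combines denotation correctness with
% compatibility to the text; w is an illustrative placeholder.
\theta \;\leftarrow\; \theta + \eta \sum_{y \in \mathcal{Y}(x)} w(x, y)\, \nabla_\theta \log p_\theta(y \mid x)
```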
Equivalence Between Wasserstein and Value-Aware Model-based Reinforcement Learning
TLDR
Using tools from Lipschitz continuity, it is shown that minimizing the VAML objective is in fact equivalent to minimizing the Wasserstein metric.
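For context, the Wasserstein metric referred to here is the standard optimal-transport distance between the true and learned transition distributions; in its dual (Kantorovich-Rubinstein) form it is the largest expectation gap over 1-Lipschitz test functions, which is what connects it to a value-aware objective. The notation below is generic, not lifted from the paper:

```latex
% Kantorovich duality: the 1-Wasserstein distance between the true
% transition distribution T and the model \hat{T} is the supremum of
% expectation gaps over 1-Lipschitz test functions f.
W_1\!\left(T(\cdot \mid s, a),\, \hat{T}(\cdot \mid s, a)\right)
  \;=\; \sup_{\|f\|_{L} \le 1}
    \; \mathbb{E}_{s' \sim T}\!\left[f(s')\right]
    - \mathbb{E}_{s' \sim \hat{T}}\!\left[f(s')\right]
```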
Investigating the Role of Negatives in Contrastive Representation Learning
TLDR
Theoretically, the existence of a collision-coverage trade-off is shown, suggesting that the optimal number of negative examples should scale with the number of underlying concepts in the data; empirically, the findings broadly agree with the theory and suggest future directions to better align theory and practice.
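The role of the number of negatives can be made concrete with a standard InfoNCE-style contrastive loss, where k negative examples appear alongside the positive in a softmax; the implementation below is a generic sketch of that loss, not the paper's experimental setup, and k is the quantity whose optimal scale the trade-off concerns.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style contrastive loss.

    anchor, positive: (dim,) embeddings; negatives: (k, dim).
    A "collision" occurs when a negative shares the anchor's
    underlying concept, which is what drives the trade-off above.
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_sim = anchor @ positive / temperature            # scalar similarity
    neg_sim = negatives @ anchor / temperature           # (k,) similarities
    logits = torch.cat([pos_sim.unsqueeze(0), neg_sim])  # positive at index 0
    target = torch.zeros(1, dtype=torch.long)            # classify the positive
    return F.cross_entropy(logits.unsqueeze(0), target)

# Toy usage with k = 8 negatives of dimension 32.
a, p = torch.randn(32), torch.randn(32)
loss = info_nce_loss(a, p, torch.randn(8, 32))
```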
Understanding Contrastive Learning Requires Incorporating Inductive Biases
TLDR
It is demonstrated that analyses that ignore inductive biases of the function class and training algorithm cannot adequately explain the success of contrastive learning, even provably leading to vacuous guarantees in some settings.
Combating the Compounding-Error Problem with a Multi-step Model
TLDR
This paper addresses the compounding-error problem by introducing a multi-step model that directly outputs the outcome of executing a sequence of actions, which yields better action selection compared to a one-step model.
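The distinction drawn above is between composing a one-step model with itself h times, so that each prediction feeds the next and errors compound, and querying a model that predicts the h-step outcome directly. The sketch below contrasts the two interfaces using hypothetical toy models with a small fixed per-query error; it illustrates the failure mode, not the paper's learned models.

```python
# Contrast of the two model interfaces. Both "models" are toy
# stand-ins: a biased one-step predictor that must be composed,
# versus a multi-step predictor queried once per action sequence.

def rollout_one_step(one_step_model, state, actions):
    """Compose a one-step model: each call consumes its own
    (possibly erroneous) prediction, so errors compound."""
    for a in actions:
        state = one_step_model(state, a)
    return state

def rollout_multi_step(multi_step_model, state, actions):
    """Query a multi-step model once for the outcome of the
    whole action sequence; predictions are never fed back."""
    return multi_step_model(state, tuple(actions))

# Toy 1-D dynamics (s' = s + a) with a 0.05 error per model query.
noisy_one_step = lambda s, a: s + a + 0.05           # error per step
multi_step = lambda s, acts: s + sum(acts) + 0.05    # one error total

s0, actions = 0.0, [1.0] * 10
print(rollout_one_step(noisy_one_step, s0, actions))   # drift ~ 10 * 0.05
print(rollout_multi_step(multi_step, s0, actions))     # drift ~ 0.05
```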
...