• Publications
  • Influence
nocaps: novel object captioning at scale
TLDR
We introduce nocaps, the first large-scale benchmark for novel object captioning, containing nearly 400 novel object classes, and provide analysis to guide future work. Expand
  • 32
  • 8
  • PDF
Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering
TLDR
We propose a new class of probabilistic neural-symbolic models, that have symbolic functional programs as a latent, stochastic variable. Expand
  • 23
  • 1
  • PDF
Continual Reinforcement Learning in 3D Non-stationary Environments
TLDR
We propose and openly release CRLMaze, a new benchmark for learning continually through reinforcement in a complex 3D non-stationary task based on ViZDoom and subject to several environmental changes. Expand
  • 11
  • 1
  • PDF
VirTex: Learning Visual Representations from Textual Annotations
TLDR
We propose VirTex -- a pretraining approach using semantically dense captions to learn visual representations. Expand
  • 8
  • 1
  • PDF
CASTing Your Model: Learning to Localize Improves Self-Supervised Representations
TLDR
We propose Contrastive Attention-Supervised Tuning (CAST), a training method to improve the visual grounding ability of contrastive SSL models and propose a solution to overcome them. Expand