• Publications
Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions?
We conduct large-scale studies on 'human attention' in Visual Question Answering (VQA) to understand where humans choose to look to answer questions about images. We design and test multiple…
  • 224
  • 21
Object-Proposal Evaluation Protocol is ‘Gameable’
Object proposals have quickly become the de facto preprocessing step in a number of vision pipelines (for object detection, object discovery, and other tasks). Their performance is usually evaluated…
  • 65
  • 5
nocaps: novel object captioning at scale
Image captioning models have achieved impressive results on datasets containing limited visual concepts and large amounts of paired image-caption training data. However, if these models are to ever…
  • 21
  • 5
EvalAI: Towards Better Evaluation Systems for AI Agents
We introduce EvalAI, an open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. EvalAI is built to provide a scalable solution to…
  • 14
  • 3
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning
Diverse and accurate vision+language modeling is an important goal to retain creative freedom and maintain user engagement. However, adequately capturing the intricacies of diversity in language…
  • 11
  • 3
CloudCV: Large-Scale Distributed Computer Vision as a Cloud Service
We are witnessing a proliferation of massive visual data. Unfortunately, scaling existing computer vision algorithms to large datasets leaves researchers repeatedly solving the same algorithmic,…
  • 41
  • 2
Sort Story: Sorting Jumbled Images and Captions into Stories
Temporal common sense has applications in AI tasks such as QA, multi-document summarization, and human-AI communication. We propose the task of sequencing -- given a jumbled set of aligned…
  • 33
  • 1
CloudCV: Deep Learning and Computer Vision on the Cloud
We are witnessing a proliferation of massive visual data. Visual content is arguably the fastest growing data on the web. Photo-sharing websites like Flickr and Facebook now host more than 6 and 90…
  • 1
Fabrik: An Online Collaborative Neural Network Editor
We present Fabrik, an online neural network editor that provides tools to visualize, edit, and share neural networks from within a browser. Fabrik provides a simple and intuitive GUI to import neural…
  • 2