Microsoft COCO: Common Objects in Context
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene...
VQA: Visual Question Answering
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language...
CIDEr: Consensus-based image description evaluation
A novel paradigm for evaluating image descriptions that uses human consensus is proposed, and a new automated metric that captures human judgment of consensus better than existing metrics across sentences generated by various sources is evaluated.
Edge Boxes: Locating Object Proposals from Edges
A novel method for generating object bounding box proposals using edges is proposed, showing results that are significantly more accurate than the current state-of-the-art while being faster to compute.
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
This work presents a diagnostic dataset that tests a range of visual reasoning abilities and uses this dataset to analyze a variety of modern visual reasoning systems, providing novel insights into their abilities and limitations.
Microsoft COCO Captions: Data Collection and Evaluation Server
The Microsoft COCO Caption dataset and evaluation server are described, and several popular metrics, including BLEU, METEOR, ROUGE and CIDEr, are used to score candidate captions.
High-quality video view interpolation using a layered representation
This paper shows how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms, and develops a novel temporal two-layer compressed representation that handles matting.
Structured Forests for Fast Edge Detection
This paper formulates the problem of predicting local edge masks in a structured learning framework applied to random decision forests and develops a novel approach to learning decision trees that robustly maps the structured labels to a discrete space on which standard information gain measures may be evaluated.
Fast Edge Detection Using Structured Forests
This paper formulates the problem of predicting local edge masks in a structured learning framework applied to random decision forests and develops a novel approach to learning decision trees that robustly maps the structured labels to a discrete space on which standard information gain measures may be evaluated.
Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks
The Inside-Outside Net (ION), an object detector that exploits information both inside and outside the region of interest, provides strong evidence that context and multi-scale representations improve small object detection.