DenseCap: Fully Convolutional Localization Networks for Dense Captioning

Abstract

We introduce the dense captioning task, which requires a computer vision system to both localize and describe salient regions in images in natural language. The dense captioning task generalizes object detection when the descriptions consist of a single word, and Image Captioning when one predicted region covers the full image. To address the localization… (More)
DOI: 10.1109/CVPR.2016.494

Topics

Statistics

01002002015201620172018
Citations per Year

344 Citations

Semantic Scholar estimates that this publication has 344 citations based on the available data.

See our FAQ for additional information.

  • Presentations referencing similar topics