Informative Image Captioning with External Sources of Information

@article{Zhao2019InformativeIC,
  title={Informative Image Captioning with External Sources of Information},
  author={Sanqiang Zhao and Piyush Sharma and Tomer Levinboim and Radu Soricut},
  journal={ArXiv},
  year={2019},
  volume={abs/1906.08876}
}
  • Sanqiang Zhao, Piyush Sharma, Tomer Levinboim, Radu Soricut
  • Published 2019
  • Computer Science
  • ArXiv
  • An image caption should fluently present the essential information in a given image, including informative, fine-grained entity mentions and the manner in which these entities interact. However, current captioning models are usually trained to generate captions that only contain common object names, thus falling short on an important "informativeness" dimension. We present a mechanism for integrating image information together with fine-grained labels (assumed to be generated by some upstream…

    Paper Mentions

    Improving Image Captioning with Better Use of Captions
    ICECAP: Information Concentrated Entity-aware Image Captioning
    Transform and Tell: Entity-Aware News Image Captioning
    VisualNews : Benchmark and Challenges in Entity-aware Image Captioning
    VisualNews : A Large Multi-source News Image Dataset
    Reinforcing an Image Caption Generator Using Off-Line Human Feedback
