Corpus ID: 13903325

Seeing with Humans: Gaze-Assisted Neural Image Captioning

@article{Sugano2016SeeingWH,
  title={Seeing with Humans: Gaze-Assisted Neural Image Captioning},
  author={Yusuke Sugano and Andreas Bulling},
  journal={ArXiv},
  year={2016},
  volume={abs/1608.05203}
}
  • Yusuke Sugano, Andreas Bulling
  • Published 2016
  • Computer Science
  • ArXiv
  • Gaze reflects how humans process visual scenes and is therefore increasingly used in computer vision systems. Previous works demonstrated the potential of gaze for object-centric tasks, such as object localization and recognition, but it remains unclear if gaze can also be beneficial for scene-centric tasks, such as image captioning. We present a new perspective on gaze-assisted image captioning by studying the interplay between human gaze and the attention mechanism of deep neural networks…
    Predicting Human Eye Fixations via an LSTM-Based Saliency Attentive Model
    212
    Can Saliency Information Benefit Image Captioning Models?
    3
    Paying Attention to Descriptions Generated by Image Captioning Models
    33
    Object Referring in Videos with Language and Human Gaze
    21
    Boosted Attention: Leveraging Human Attention for Image Captioning
    19
    Gaze360: Physically Unconstrained Gaze Estimation in the Wild
    14
    Paying More Attention to Saliency
    33
    Predicting the Category and Attributes of Mental Pictures Using Deep Gaze Pooling
