Video Description with Spatial-Temporal Attention

Yunbin Tu, Xishan Zhang, Bingtao Liu, Chenggang Clarence Yan
ACM Multimedia
Temporal attention has been widely used in video description to adaptively focus on important frames. However, most existing methods based on temporal attention suffer from recognition errors and missing details, because they employ only coarse frame-level global features. Inspired by recent successful work in image description using spatial attention, we propose a spatial-temporal attention (STAT) method to address these problems. In particular, first, we take advantage of object…
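
To make the idea of temporal attention over frame-level features concrete, the following is a minimal sketch of generic soft attention, not the STAT method itself. The dot-product scoring, the function names `softmax` and `temporal_attention`, and the toy feature vectors are all assumptions chosen for illustration; the abstract does not specify how scores are computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frame_features, query):
    """Weight each frame by its relevance to a query vector.

    frame_features: list of per-frame global feature vectors (hypothetical).
    query: vector standing in for the decoder state (hypothetical).
    Scoring is a plain dot product, which is an assumption, not the paper's choice.
    """
    scores = [
        sum(f * q for f, q in zip(frame, query))
        for frame in frame_features
    ]
    weights = softmax(scores)
    dim = len(frame_features[0])
    # Context vector: attention-weighted sum of the frame features.
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three frames with 2-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 2.0]
weights, context = temporal_attention(frames, query)
```

Here the second and third frames receive equal and larger weights than the first, because only they have a nonzero component along the query. With one global vector per frame, the weights can only select whole frames, which is the coarseness the abstract points to as a limitation.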

