Long-term recurrent convolutional networks for visual recognition and description

@inproceedings{Donahue2015LongtermRC,
  title={Long-term recurrent convolutional networks for visual recognition and description},
  author={J. Donahue and Lisa Anne Hendricks and Marcus Rohrbach and Subhashini Venugopalan and S. Guadarrama and Kate Saenko and Trevor Darrell},
  booktitle={2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2015},
  pages={2625--2634}
}
  • J. Donahue, Lisa Anne Hendricks, +4 authors Trevor Darrell
  • Published 2015
  • Computer Science, Medicine
  • 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent, or “temporally deep”, are effective for tasks involving sequences, visual and otherwise. We develop a novel recurrent convolutional architecture suitable for large-scale visual learning which is end-to-end trainable, and demonstrate the value of these models on benchmark video recognition tasks, image description and retrieval problems, and video…
    3,404 Citations
    Making Convolutional Networks Recurrent for Visual Sequence Learning
    • 17
    Delving Deeper into Convolutional Networks for Learning Video Representations
    • 327
    Describing Videos by Exploiting Temporal Structure
    • 749
    Multi-Level Recurrent Residual Networks for Action Recognition
    • 3
    Learning Contextual Dependence With Convolutional Hierarchical Recurrent Neural Networks
    • 32
    Lattice Long Short-Term Memory for Human Action Recognition
    • 81
    A Study on the use of State-of-the-Art CNNs with Fine Tuning for Spatial Stream Generation for Activity Recognition
    • M. Ranjit, G. Ganapathy
    • Computer Science
    • 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT)
    • 2019
    Recurrent Encoder-Decoder Networks for Time-Varying Dense Prediction
    • 4
    Deep Contextual Recurrent Residual Networks for Scene Labeling
    • 14
    Recurrent Spatiotemporal Feature Learning for Action Recognition

    References

    SHOWING 1-10 OF 88 REFERENCES
    Describing Videos by Exploiting Temporal Structure
    • 749
    Beyond short snippets: Deep networks for video classification
    • 1,621
    Explain Images with Multimodal Recurrent Neural Networks
    • 288
    • Highly Influential
    Large-Scale Video Classification with Convolutional Neural Networks
    • 4,317
    • Highly Influential
    Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
    • 847
    • Highly Influential
    Two-Stream Convolutional Networks for Action Recognition in Videos
    • 4,290
    • Highly Influential
    Deep visual-semantic alignments for generating image descriptions
    • 1,881
    • Highly Influential
    Show and tell: A neural image caption generator
    • 3,649
    DeViSE: A Deep Visual-Semantic Embedding Model
    • 1,508