Corpus ID: 11699847

Unsupervised Learning of Video Representations using LSTMs

@article{Srivastava2015UnsupervisedLO,
  title={Unsupervised Learning of Video Representations using LSTMs},
  author={Nitish Srivastava and Elman Mansimov and R. Salakhutdinov},
  journal={ArXiv},
  year={2015},
  volume={abs/1502.04681}
}
  • Nitish Srivastava, Elman Mansimov, R. Salakhutdinov
  • Published 2015
  • Computer Science, Mathematics
  • ArXiv
  • We use Long Short Term Memory (LSTM) networks to learn representations of video sequences. [...] Key Method We experiment with two kinds of input sequences - patches of image pixels and high-level representations ("percepts") of video frames extracted using a pretrained convolutional net. We explore different design choices such as whether the decoder LSTMs should condition on the generated output. We analyze the outputs of the model qualitatively to see how well the model can extrapolate the learned video…Expand Abstract

    Citations

    Publications citing this paper.
    SHOWING 1-10 OF 1,339 CITATIONS, ESTIMATED 94% COVERAGE

    Video Representation Learning by Dense Predictive Coding

    VIEW 1 EXCERPT
    CITES METHODS

    Matching video net: Memory-based embedding for video action recognition

    VIEW 1 EXCERPT
    CITES METHODS

    VIDEOWHISPER: Towards unsupervised learning of discriminative features of videos with RNN

    VIEW 6 EXCERPTS
    CITES METHODS
    HIGHLY INFLUENCED

    Action Recognition and Video Description using Visual Attention

    VIEW 2 EXCERPTS
    CITES BACKGROUND & METHODS

    Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction

    VIEW 1 EXCERPT
    CITES BACKGROUND

    FILTER CITATIONS BY YEAR

    2015
    2020

    CITATION STATISTICS

    • 162 Highly Influenced Citations

    • Averaged 309 Citations per year from 2018 through 2020

    References

    Publications referenced by this paper.
    SHOWING 1-10 OF 40 REFERENCES

    Long-term recurrent convolutional networks for visual recognition and description

    VIEW 1 EXCERPT

    Large-Scale Video Classification with Convolutional Neural Networks

    VIEW 3 EXCERPTS

    Show and tell: A neural image caption generator

    VIEW 2 EXCERPTS

    3D Convolutional Neural Networks for Human Action Recognition

    VIEW 2 EXCERPTS