Unsupervised Learning of Video Representations using LSTMs

Abstract

We use multilayer Long Short Term Memory (LSTM) networks to learn representations of video sequences. Our model uses an encoder LSTM to map an input sequence into a fixed length representation. This representation is decoded using single or multiple decoder LSTMs to perform different tasks, such as reconstructing the input sequence, or predicting the future…
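
The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a standard single-layer LSTM cell, random untrained weights, and a decoder that feeds each predicted frame back in as its next input. All names here (`init_lstm`, `lstm_step`, `encode`, `decode`, `W_out`) and all sizes are hypothetical and chosen only for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_lstm(input_size, hidden_size, rng):
    """Random weights for one LSTM layer (assumed layout: 4 gates over [x, h])."""
    rows, cols = 4 * hidden_size, input_size + hidden_size
    W = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    b = [0.0] * rows
    return W, b


def lstm_step(x, h, c, params):
    """One step of a standard LSTM cell; returns the new (h, c)."""
    W, b = params
    H = len(h)
    xh = x + h  # concatenate input and previous hidden state
    z = [sum(w * v for w, v in zip(row, xh)) + bias for row, bias in zip(W, b)]
    i = [sigmoid(v) for v in z[0:H]]             # input gate
    f = [sigmoid(v) for v in z[H:2 * H]]         # forget gate
    o = [sigmoid(v) for v in z[2 * H:3 * H]]     # output gate
    g = [math.tanh(v) for v in z[3 * H:4 * H]]   # candidate cell update
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def encode(frames, params, hidden_size):
    """Run the encoder over all frames; the final state is the fixed-length code."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    return h, c


def decode(state, steps, params, W_out, frame_size):
    """Unroll a decoder from the encoder state, emitting one frame per step."""
    h, c = state
    x = [0.0] * frame_size  # assumption: start from an all-zero frame
    outputs = []
    for _ in range(steps):
        h, c = lstm_step(x, h, c, params)
        # Linear readout from hidden state to a frame (hypothetical output layer).
        x = [sum(w * v for w, v in zip(row, h)) for row in W_out]
        outputs.append(x)
    return outputs


rng = random.Random(0)
frame_size, hidden_size = 4, 8  # toy sizes, for illustration only
frames = [[rng.random() for _ in range(frame_size)] for _ in range(5)]

encoder = init_lstm(frame_size, hidden_size, rng)
recon_decoder = init_lstm(frame_size, hidden_size, rng)
future_decoder = init_lstm(frame_size, hidden_size, rng)
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
         for _ in range(frame_size)]

code = encode(frames, encoder, hidden_size)
# Two decoders share the same code but serve different tasks.
reconstruction = decode(code, len(frames), recon_decoder, W_out, frame_size)
future = decode(code, 3, future_decoder, W_out, frame_size)
```

The point of the sketch is that the encoder state `code` has the same size regardless of how many frames were read, and that several decoders can be unrolled from that single state. Training, multiple layers, and any details of the actual model are outside what this example shows.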

Statistics

Semantic Scholar estimates that this publication has 532 citations based on the available data.
