Corpus ID: 14278057

Video Pixel Networks

@article{Kalchbrenner2017VideoPN,
  title={Video Pixel Networks},
  author={Nal Kalchbrenner and A{\"a}ron van den Oord and Karen Simonyan and Ivo Danihelka and Oriol Vinyals and Alex Graves and Koray Kavukcuoglu},
  journal={ArXiv},
  year={2017},
  volume={abs/1610.00527}
}
  • Nal Kalchbrenner, A. Oord, +4 authors K. Kavukcuoglu
  • Published 2017
  • Computer Science
  • ArXiv
  • We propose a probabilistic video model, the Video Pixel Network (VPN), that estimates the discrete joint distribution of the raw pixel values in a video. The model and the neural architecture reflect the time, space and color structure of video tensors and encode it as a four-dimensional dependency chain. The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground…
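
The "four-dimensional dependency chain" in the abstract can be read as a chain-rule factorization over time, the two spatial axes, and color. The following is only a sketch under assumed notation, none of which appears in the abstract itself: $x$ is a video tensor with $T$ frames of $N \times N$ pixels, $x_{<}$ stands for every pixel value that precedes position $(t, i, j)$ in a frame-by-frame raster ordering, and the color channels are assumed to be predicted in the order R, G, B.

```latex
p(x) = \prod_{t=1}^{T} \prod_{i=1}^{N} \prod_{j=1}^{N}
       p\left(x_{t,i,j,R} \mid x_{<}\right)\,
       p\left(x_{t,i,j,G} \mid x_{<},\, x_{t,i,j,R}\right)\,
       p\left(x_{t,i,j,B} \mid x_{<},\, x_{t,i,j,R},\, x_{t,i,j,G}\right)
```

Each factor is a discrete distribution over the possible intensity values of one color channel, so the product assigns an exact likelihood to the whole video without any independence assumption between pixels.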
