Corpus ID: 8142135

Pixel Recurrent Neural Networks

@article{Oord2016PixelRN,
  title={Pixel Recurrent Neural Networks},
  author={A{\"a}ron van den Oord and Nal Kalchbrenner and Koray Kavukcuoglu},
  journal={ArXiv},
  year={2016},
  volume={abs/1601.06759}
}
Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also…
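
To make the key method concrete, here is a minimal sketch (my own PyTorch illustration, not the authors' code) of the masked convolution used by the PixelCNN variant the paper describes: kernel weights at and after the current position are zeroed, so each pixel is predicted only from the pixels above it and to its left, giving the raster-scan factorization p(x) = ∏_i p(x_i | x_1, …, x_{i-1}).

import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Conv2d whose kernel is zeroed at and after the current pixel (raster order).

    Mask type "A" also zeroes the centre weight, so the first layer never lets a
    pixel see itself; type "B" keeps the centre and is used in later layers.
    """
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ("A", "B")
        kh, kw = self.kernel_size
        mask = torch.zeros_like(self.weight)
        mask[:, :, : kh // 2, :] = 1          # all rows strictly above the centre
        mask[:, :, kh // 2, : kw // 2] = 1    # centre row, left of the centre
        if mask_type == "B":
            mask[:, :, kh // 2, kw // 2] = 1  # the centre pixel itself
        self.register_buffer("mask", mask)

    def forward(self, x):
        self.weight.data *= self.mask         # re-apply the mask before every conv
        return super().forward(x)

# Tiny usage example on a 1-channel 8x8 "image".
layer = MaskedConv2d("A", in_channels=1, out_channels=16, kernel_size=7, padding=3)
print(layer(torch.randn(1, 1, 8, 8)).shape)   # torch.Size([1, 16, 8, 8])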
PixelVAE: A Latent Variable Model for Natural Images
Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and model global structure well but have difficulty…
PixelCNN Models with Auxiliary Variables for Natural Image Modeling
TLDR
Two new generative image models are described that exploit different image transformations as auxiliary variables and produce much more realistic-looking image samples than previous state-of-the-art probabilistic models.
Learning Resolution-independent Image Representations
  • J. Hammer, Michael S. Gashler
  • Computer Science
    2018 IEEE 17th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC)
  • 2018
TLDR
A collection of methods is described that enables deep neural networks to represent images with continuous, resolution-independent representations, using an MCMC algorithm that directs attention during the learning phase to regions of the image that deviate from the current model.
Latent Variable PixelCNNs for Natural Image Modeling
TLDR
Benefits of the LatentPixelCNN models are experimentally demonstrated, showing that they produce much more realistic-looking image samples than previous state-of-the-art probabilistic models.
Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling
TLDR
It is shown that augmenting the decoder of a hierarchical VAE with spatial dependency layers considerably improves density estimation over baseline convolutional architectures and over the state of the art among models of the same class.
Deep Probabilistic Modeling of Natural Images using a Pyramid Decomposition
TLDR
A new technique for probabilistic modeling of natural images is introduced that combines the advantages of classic multi-scale and modern deep learning models; it achieves new state-of-the-art image modeling performance on the CIFAR-10 dataset while being much faster than competitive models.
SHAMANN: Shared Memory Augmented Neural Networks
TLDR
Shared-memory augmented neural network actors are proposed as a dynamically scalable alternative for semantic segmentation: the image is decomposed into a sequence of local patches, and the actors are trained to segment each patch sequentially.
Deep Markov Random Field for Image Modeling
TLDR
A novel MRF model is proposed that uses fully connected neurons to express the complex interactions among pixels; it combines the expressive power of deep neural networks with the cyclic dependency structure of MRFs in a unified model, bringing modeling capability to a new level.
Locally Masked Convolution for Autoregressive Models
TLDR
LMConv is introduced: a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image, achieving improved performance on whole-image density estimation and globally coherent image completions.
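
A rough sketch of the idea, under my own assumptions rather than the paper's actual implementation: convolution can be expressed via im2col (torch's unfold), at which point a different binary mask can be applied to the receptive field at every output location.

import torch
import torch.nn.functional as F

def locally_masked_conv2d(x, weight, masks):
    """x: (B, C_in, H, W); weight: (C_out, C_in, k, k);
    masks: (B, C_in*k*k, H*W) binary, one mask column per output location."""
    b, c_in, h, w = x.shape
    c_out, _, k, _ = weight.shape
    patches = F.unfold(x, kernel_size=k, padding=k // 2)  # (B, C_in*k*k, H*W)
    patches = patches * masks                             # per-location masking
    out = weight.view(c_out, -1) @ patches                # (B, C_out, H*W)
    return out.view(b, c_out, h, w)

# Example: one 3x3 raster-scan mask (above/left of centre) shared by all
# 8x8 locations; per-location masks would differ column by column.
x = torch.randn(2, 1, 8, 8)
weight = torch.randn(4, 1, 3, 3)
m = torch.tensor([1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=torch.float32)
masks = m.repeat(2, 1).unsqueeze(-1).expand(2, 9, 64)
print(locally_masked_conv2d(x, weight, masks).shape)  # torch.Size([2, 4, 8, 8])

Because the masks are supplied per location, the same weights can realize different pixel orderings, which is what the entry above credits for the globally coherent image completions.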
Generative Pretraining From Pixels
TLDR
This work trains a sequence Transformer to auto-regressively predict pixels, without incorporating knowledge of the 2D input structure, and finds that a GPT-2 scale model learns strong image representations as measured by linear probing, fine-tuning, and low-data classification.

References

Generative Image Modeling Using Spatial LSTMs
TLDR
A recurrent image model is introduced based on multidimensional long short-term memory units, which are particularly suited to image modeling due to their spatial structure; it outperforms the state of the art in quantitative comparisons on several image datasets and produces promising results for texture synthesis and inpainting.
DRAW: A Recurrent Neural Network For Image Generation
TLDR
The Deep Recurrent Attentive Writer neural network architecture for image generation substantially improves on the state of the art for generative models on MNIST, and, when trained on the Street View House Numbers dataset, it generates images that cannot be distinguished from real data with the naked eye.
Factoring Variations in Natural Images with Deep Gaussian Mixture Models
TLDR
This paper proposes a new scalable deep generative model for images, called the Deep Gaussian Mixture Model, that is a straightforward but powerful generalization of GMMs to multiple layers, and shows that deeper GMM architectures generalize better than shallower ones.
Learning Multiple Layers of Features from Tiny Images
TLDR
It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
Deep Residual Learning for Image Recognition
TLDR
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Deep AutoRegressive Networks
TLDR
An efficient approximate parameter estimation method based on the minimum description length (MDL) principle is derived, which can be seen as maximising a variational lower bound on the log-likelihood, with a feedforward neural network implementing approximate inference.
MADE: Masked Autoencoder for Distribution Estimation
TLDR
This work introduces a simple modification for autoencoder neural networks that yields powerful generative models and proves that this approach is competitive with state-of-the-art tractable distribution estimators.
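
The "simple modification" is weight masking; below is a compact sketch (my own construction following the standard degree-based recipe, not the paper's code) showing how masks make an autoencoder's outputs valid autoregressive conditionals: output dimension d is only allowed to see input dimensions strictly before d.

import numpy as np

rng = np.random.default_rng(0)
D, H = 4, 8                                   # input dimension, hidden units
deg_in = np.arange(1, D + 1)                  # input degrees 1..D
deg_h = rng.integers(1, D, size=H)            # hidden degrees in 1..D-1
mask_in_to_h = (deg_h[:, None] >= deg_in[None, :]).astype(float)  # (H, D)
mask_h_to_out = (deg_in[:, None] > deg_h[None, :]).astype(float)  # (D, H)

W1 = rng.normal(size=(H, D)) * mask_in_to_h   # masked first layer
W2 = rng.normal(size=(D, H)) * mask_h_to_out  # masked output layer
out = W2 @ np.tanh(W1 @ rng.normal(size=D))   # one masked forward pass

# Output d depends only on inputs 1..d-1: the reachability matrix must be
# strictly lower triangular.
conn = (mask_h_to_out @ mask_in_to_h) > 0     # (D, D) connectivity
assert not conn[np.triu_indices(D)].any()
print(conn.astype(int))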
A Deep and Tractable Density Estimator
TLDR
This work introduces an efficient procedure to simultaneously train a NADE model for each possible ordering of the variables, by sharing parameters across all these models.
Deep Boltzmann Machines
TLDR
A new learning algorithm is presented for Boltzmann machines that contain many layers of hidden variables; it is made more efficient by a layer-by-layer “pre-training” phase that allows variational inference to be initialized with a single bottom-up pass.
Parallel Multi-Dimensional LSTM, With Application to Fast Biomedical Volumetric Image Segmentation
TLDR
This work re-arranges the traditional cuboid order of computations in MD-LSTM in pyramidal fashion and achieves the best known pixel-wise brain image segmentation results on MRBrainS13 (and competitive results on EM-ISBI12).