Corpus ID: 15624550

Latent Variable PixelCNNs for Natural Image Modeling

Alexander Kolesnikov and Christoph H. Lampert
We study probabilistic models of natural images and extend the autoregressive family of PixelCNN models by incorporating latent variables. Subsequently, we describe two new generative image models that exploit different image transformations as latent variables: a quantized grayscale view of the image or a multi-resolution image pyramid. The proposed models tackle two known shortcomings of existing PixelCNN models: 1) their tendency to focus on low-level image details, while largely ignoring…
PixColor: Pixel Recursive Colorization
This work proposes a novel approach to automatically produce multiple colorized versions of a grayscale image, and demonstrates that this approach produces more diverse and plausible colorizations than existing methods, as judged by human raters in a "Visual Turing Test".
Deep Learning-Based Video Coding
To encourage research on deep learning-based video coding, this work presents a case study of a prototype video codec, Deep Learning Video Coding (DLVC), which features two deep tools based on convolutional neural networks: a CNN-based in-loop filter and CNN-based block adaptive resolution coding.
Qumran Letter Restoration by Rotation and Reflection Modified PixelCNN
This work presents a method for completing broken letters in the Dead Sea Scrolls. The method is based on PixelCNN++ and modifies the original approach to allow reconstructions in multiple directions.


PixelVAE: A Latent Variable Model for Natural Images
Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and model global structure well but have difficulty…
Pixel Recurrent Neural Networks
A deep neural network is presented that sequentially predicts the pixels in an image along the two spatial dimensions and encodes the complete set of dependencies in the image to achieve log-likelihood scores on natural images that are considerably better than the previous state of the art.
Conditional Image Generation with PixelCNN Decoders
The gated convolutional layers in the proposed model improve the log-likelihood of PixelCNN to match the state-of-the-art performance of PixelRNN on ImageNet, with greatly reduced computational cost.
Fields of Experts: a framework for learning image priors
  • S. Roth, Michael J. Black
  • Computer Science
  • 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05)
  • 2005
A framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks, developed using a Products-of-Experts framework.
Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
A generative parametric model capable of producing high quality samples of natural images using a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
Factoring Variations in Natural Images with Deep Gaussian Mixture Models
This paper proposes a new scalable deep generative model for images, called the Deep Gaussian Mixture Model, that is a straightforward but powerful generalization of GMMs to multiple layers, and shows that deeper GMM architectures generalize better than more shallow ones.
An Architecture for Deep, Hierarchical Generative Models
We present an architecture which lets us train deep, directed generative models with many layers of latent variables. We include deterministic paths between all latent variables and the generated…
Pixel Recursive Super Resolution
This work proposes a new probabilistic deep network architecture, a pixel recursive super resolution model, that is an extension of PixelCNNs to address the problem of artificially enlarging a low resolution photograph to recover a plausible high resolution version.
From learning models of natural image patches to whole image restoration
A generic framework is proposed that allows whole image restoration using any patch-based prior for which a MAP (or approximate MAP) estimate can be calculated, and a generic, surprisingly simple Gaussian Mixture prior is presented, learned from a set of natural images.
Density estimation using Real NVP
This work extends the space of probabilistic models using real-valued non-volume preserving (real NVP) transformations, a set of powerful invertible and learnable transformations, resulting in an unsupervised learning algorithm with exact log-likelihood computation, exact sampling, exact inference of latent variables, and an interpretable latent space.