Improving VAE-based Representation Learning

Mingtian Zhang, Tim Z. Xiao, Brooks Paige, David Barber
Latent variable models like the Variational Auto-Encoder (VAE) are commonly used to learn representations of images. However, for downstream tasks like semantic classification, the representations learned by a VAE are less competitive than those of other non-latent-variable models. This has led to speculation that latent variable models may be fundamentally unsuitable for representation learning. In this work, we study what properties are required for good representations and how different VAE…


CLUTR: Curriculum Learning via Unsupervised Task Representation Learning

This work introduces CLUTR, a novel curriculum learning algorithm that decouples task representation and curriculum learning into a two-stage optimization and outperforms PAIRED in generalization and sample efficiency in the challenging CarRacing and navigation environments.

Parallel Neural Local Lossless Compression

This paper proposes two parallelization schemes for local autoregressive models and provides experimental evidence of gains in compression runtime compared to the previous, non-parallel implementation.

Generalization Gap in Amortized Inference

This work proposes a new training objective, inspired by the classic wake-sleep algorithm, to improve the generalization properties of amortized inference, and demonstrates how it can improve generalization performance in the context of image modeling and lossless compression.

Understanding Anomaly Detection with Deep Invertible Networks through Hierarchies of Distributions and Features

Two methods are proposed; the first uses the log-likelihood ratio of two identical models, one trained on the in-distribution data and the other on a more general distribution of images. Both achieve strong anomaly detection performance in the unsupervised setting, reaching performance comparable to state-of-the-art classifier-based methods in the supervised setting.
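The log-likelihood-ratio idea can be sketched in a few lines. This is only an illustration of the scoring rule, not the paper's implementation: the Gaussian densities and the names `log_p_in` and `log_p_general` below are placeholder stand-ins for the two identically structured models.

```python
import math

def anomaly_score(x, log_p_in, log_p_general):
    """Log-likelihood ratio: higher score = more likely in-distribution."""
    return log_p_in(x) - log_p_general(x)

# Toy 1-D Gaussian log-densities standing in for the two trained models.
def gaussian_logpdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

log_p_in = lambda x: gaussian_logpdf(x, mu=0.0, sigma=1.0)       # narrow: in-distribution model
log_p_general = lambda x: gaussian_logpdf(x, mu=0.0, sigma=5.0)  # broad: general-image model

print(anomaly_score(0.0, log_p_in, log_p_general))   # in-distribution point: high score
print(anomaly_score(10.0, log_p_in, log_p_general))  # outlier: low score
```

The key property is that the general model's likelihood cancels out features shared by all images, so the ratio isolates what is specific to the in-distribution data.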

PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications

This work discusses an implementation of PixelCNNs, a recently proposed class of powerful generative models with tractable likelihood, that contains a number of modifications to the original model which both simplify its structure and improve its performance.
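The discretized logistic likelihood named in the title can be sketched as follows. This is a single-component simplification (the paper uses a mixture of logistics), and the function names are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def discretized_logistic_prob(v, mu, s):
    """Probability mass of 8-bit pixel value v in {0,...,255}, rescaled to [-1,1].

    mu, s: location and scale of the logistic distribution.
    """
    x = v / 127.5 - 1.0  # map {0..255} onto [-1, 1]; half a bin is 1/255
    plus = sigmoid((x + 1 / 255.0 - mu) / s)
    minus = sigmoid((x - 1 / 255.0 - mu) / s)
    if x <= -1 + 1e-6:       # lowest bin absorbs the left tail
        return plus
    if x >= 1 - 1e-6:        # highest bin absorbs the right tail
        return 1.0 - minus
    return plus - minus      # CDF difference over one pixel-wide bin

# The masses over all 256 pixel values telescope to exactly 1.
total = sum(discretized_logistic_prob(v, mu=0.0, s=0.1) for v in range(256))
print(round(total, 6))  # → 1.0
```

Because each pixel's probability is a CDF difference rather than a density, the model assigns a proper probability mass to every discrete pixel value, which is what makes the likelihood tractable and exactly normalized.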

PixelVAE: A Latent Variable Model for Natural Images

Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and model global structure well but have difficulty…

On the Out-of-distribution Generalization of Probabilistic Image Modelling

This work proposes a Local Autoregressive model that exclusively models local image features to improve OOD performance, and employs the model to build a new lossless image compressor, NeLLoC (Neural Local Lossless Compressor), reporting state-of-the-art compression rates and model size.

Learning deep representations by mutual information estimation and maximization

It is shown that structure matters: incorporating knowledge about locality in the input into the objective can significantly improve a representation’s suitability for downstream tasks and is an important step towards flexible formulations of representation learning objectives for specific end-goals.

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

Semi-supervised Learning with Deep Generative Models

It is shown that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.

Reading Digits in Natural Images with Unsupervised Feature Learning

A new benchmark dataset for research use is introduced, containing over 600,000 labeled digits cropped from Street View images, and variants of two recently proposed unsupervised feature learning methods are employed, which prove convincingly superior on these benchmarks.

Auto-Encoding Variational Bayes

A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
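The objective this algorithm optimizes is the evidence lower bound (ELBO), reproduced here in its standard form for reference:

```latex
% Evidence lower bound (ELBO) maximized by Auto-Encoding Variational Bayes
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
```

The reparameterization trick, writing $z = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$, makes the expectation differentiable in $\phi$ and enables training with stochastic gradients.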