An Introduction to Variational Autoencoders

@article{Kingma2019AnIT,
  title={An Introduction to Variational Autoencoders},
  author={Diederik P. Kingma and Max Welling},
  journal={ArXiv},
  year={2019},
  volume={abs/1906.02691}
}
Variational autoencoders provide a principled framework for learning deep latent-variable models and corresponding inference models. In this work, we provide an introduction to variational autoencoders and some important extensions. 
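Before the citing works and references listed below, it may help to fix the model class the abstract refers to with a small, self-contained sketch: a Gaussian-encoder, Bernoulli-decoder VAE trained by maximizing the evidence lower bound (ELBO) via the reparameterization trick. The PyTorch code below is purely illustrative; the class and layer names (VAE, fc_mu, fc_logvar) are ours and not taken from the paper.

# Minimal sketch of a Gaussian-encoder / Bernoulli-decoder VAE (illustrative names only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=256, z_dim=20):
        super().__init__()
        # Inference model q(z|x): mean and log-variance of a diagonal Gaussian.
        self.enc = nn.Linear(x_dim, h_dim)
        self.fc_mu = nn.Linear(h_dim, z_dim)
        self.fc_logvar = nn.Linear(h_dim, z_dim)
        # Generative model p(x|z): Bernoulli logits over pixels.
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I),
        # so gradients flow through mu and logvar.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        logits = self.dec(z)
        # Negative ELBO = reconstruction term + KL(q(z|x) || p(z)), with p(z) = N(0, I).
        recon = F.binary_cross_entropy_with_logits(logits, x, reduction='sum')
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return (recon + kl) / x.size(0)

# Usage sketch: one gradient step on a batch of binarized 28x28 images.
model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(32, 784).bernoulli()  # stand-in for real data
loss = model(x)
opt.zero_grad(); loss.backward(); opt.step()

The returned quantity is the per-example negative ELBO, so a standard optimizer minimizes it directly.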
On Disentanglement and Mutual Information in Semi-Supervised Variational Auto-Encoders
TLDR
This work considers the semi-supervised setting, in which the factors of variation are labelled for a small fraction of the samples, and examines how the quality of learned representations is affected by the dimension of the unsupervised component of the latent space.
The Theoretical Breakthrough of Self-Supervised Learning: Variational Autoencoders and Its Application In Big Data Analysis
TLDR
The basic principle of variational autoencoders is introduced, their application in the context of big data is analyzed, and open problems in both theory and application are discussed.
Self-Reflective Variational Autoencoder
TLDR
This work redesigns the hierarchical structure of existing VAE architectures; the proposed self-reflection mechanism ensures that the stochastic flow preserves the factorization of the exact posterior, sequentially updating the latent codes in a recurrent manner consistent with the generative model.
An Overview of Variational Autoencoders for Source Separation, Finance, and Bio-Signal Applications
TLDR
Applications of variational autoencoders to finance, speech/audio source separation, and bio-signals are presented, and possible areas of research for improving the performance of VAEs in particular and deep generative models in general are identified.
A Comparison of Deep Learning Architectures for the 3D Generation Data
TLDR
Twelve models with different hyperparameters were created to compare the generative architectures Autoencoder, Variational Autoencoder, and Generative Adversarial Network on the 3D MNIST dataset, indicating the Autoencoder models as the best cost-benefit option.
Exploring the Latent Space of Autoencoders with Interventional Assays
TLDR
A framework called latent responses is proposed, which exploits the locally contractive behavior exhibited by variational autoencoders to explore the learned manifold; it extends the notion of disentanglement to take the learned generative process into account, thereby avoiding the limitations of existing metrics that may rely on spurious correlations.
Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
TLDR
This paper proposes a variational autoencoder with disentanglement priors for conditional natural language generation with no or only a handful of task-specific labeled examples, and shows both empirically and theoretically that the conditional priors can already disentangle representations even without the specific regularizations used in prior work.
Analysis of ODE2VAE with Examples
TLDR
This paper analyzes the latent representations inferred by the ODE2VAE model on three different physical motion datasets and shows that, to an extent, the model is able to learn meaningful latent representations without any supervision.
Probabilistic Autoencoder Using Fisher Information
TLDR
In this work, an extension of the autoencoder architecture, the FisherNet, is introduced; it has theoretical advantages in that it provides uncertainty quantification derived directly from the model and also accounts for uncertainty cross-correlations.
...
...

References

Showing 1-10 of 148 references
Towards Conceptual Compression
TLDR
A simple recurrent variational auto-encoder architecture is introduced that significantly improves image modeling and naturally separates global conceptual information from lower-level details, thus addressing one of the fundamental desiderata of unsupervised learning.
Variational Recurrent Auto-Encoders
TLDR
A model that combines the strengths of RNNs and SGVB, the Variational Recurrent Auto-Encoder (VRAE), is proposed; it can be used for efficient, large-scale unsupervised learning on time series data, mapping the time series to a latent vector representation.
Auxiliary Deep Generative Models
TLDR
This work extends deep generative models with auxiliary variables, which improve the variational approximation, and proposes a model with two stochastic layers and skip connections that achieves state-of-the-art performance in semi-supervised learning on the MNIST, SVHN, and NORB datasets.
Deep AutoRegressive Networks
TLDR
An efficient approximate parameter estimation method based on the minimum description length (MDL) principle is derived, which can be seen as maximising a variational lower bound on the log-likelihood, with a feedforward neural network implementing approximate inference.
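For background (a standard identity rather than anything specific to this paper), the MDL view mentioned above coincides with the variational bound: the expected description length of an observation, split into the code for the latents and the code for the data given the latents, equals the negative ELBO,

$\mathbb{E}_{q(z\mid x)}\!\left[-\log p(x\mid z) + \log q(z\mid x) - \log p(z)\right] = -\mathrm{ELBO}(x) \;\ge\; -\log p(x),$

so minimizing description length is the same as maximizing a variational lower bound on $\log p(x)$.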
Reinterpreting Importance-Weighted Autoencoders
TLDR
An alternative interpretation of importance-weighted autoencoders is given: they optimize the standard variational lower bound, but with respect to a more complex approximate posterior distribution.
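For reference, the importance-weighted bound with $K$ samples (stated here only as background, in notation of our choosing) is

$\mathcal{L}_K(x) = \mathbb{E}_{z_1,\dots,z_K \sim q(z\mid x)}\!\left[\log \frac{1}{K}\sum_{k=1}^{K} \frac{p(x, z_k)}{q(z_k\mid x)}\right] \;\le\; \log p(x),$

and the reinterpretation cited above views $\mathcal{L}_K$ as a standard single-sample ELBO computed under an implicitly defined, more expressive approximate posterior.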
A Structured Variational Auto-encoder for Learning Deep Hierarchies of Sparse Features
TLDR
A generative model of natural images is proposed, consisting of a deep hierarchy of layers of latent random variables, each following a new type of distribution the authors call the rectified Gaussian; it allows spike-and-slab-type sparsity while retaining the differentiability necessary for efficient stochastic gradient variational inference.
Ladder Variational Autoencoders
TLDR
A new inference model, the Ladder Variational Autoencoder, is proposed that recursively corrects the generative distribution with a data-dependent approximate likelihood, in a process resembling the recently proposed Ladder Network.
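As a sketch of the correction step described above (paraphrasing the precision-weighted combination used in the Ladder VAE construction; the notation is ours), the approximate posterior at each layer merges the bottom-up estimate $(\hat\mu, \hat\sigma^2)$ with the top-down generative estimate $(\mu_p, \sigma_p^2)$ by precision weighting:

$\sigma_q^2 = \dfrac{1}{\hat\sigma^{-2} + \sigma_p^{-2}}, \qquad \mu_q = \dfrac{\hat\mu\,\hat\sigma^{-2} + \mu_p\,\sigma_p^{-2}}{\hat\sigma^{-2} + \sigma_p^{-2}}.$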
Isolating Sources of Disentanglement in Variational Autoencoders
We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables. We use this to motivate our $\beta$-TCVAE (Total Correlation Variational Autoencoder), a refinement of the $\beta$-VAE for learning disentangled representations.
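The decomposition referred to here splits the aggregate KL term of the ELBO into an index-code mutual information term, a total correlation term, and a dimension-wise KL term:

$\mathbb{E}_{p(x)}\!\big[\mathrm{KL}(q(z\mid x)\,\|\,p(z))\big] = I_q(x; z) + \mathrm{KL}\!\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big) + \sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big),$

where the middle term is the total correlation that $\beta$-TCVAE penalizes with weight $\beta$.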
Learning Stochastic Recurrent Networks
TLDR
The proposed model is a generalisation of deterministic recurrent neural networks with latent variables, resulting in Stochastic Recurrent Networks (STORNs), and is evaluated on four polyphonic musical data sets and motion capture data.
A Hybrid Convolutional Variational Autoencoder for Text Generation
TLDR
A novel hybrid architecture is proposed that blends fully feed-forward convolutional and deconvolutional components with a recurrent language model, helping to avoid the issue of the VAE collapsing to a deterministic model.
...
...