Publications
Adam: A Method for Stochastic Optimization
TLDR
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
  • Citations: 56,546 · Highly influential: 9,917 · PDF
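The update summarized above maintains exponential moving averages of the gradient and of its square, with bias correction for their zero initialization. A minimal NumPy sketch under those assumptions (the function name is illustrative; the defaults match the hyperparameters recommended in the paper, but this is not a drop-in optimizer):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters theta given gradient grad at step t (t >= 1)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1**t)                # bias correction for zero initialization
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```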
Auto-Encoding Variational Bayes
TLDR
We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case.
  • Citations: 10,477 · Highly influential: 2,359 · PDF
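At the core of the algorithm is the reparameterization trick, which turns sampling of the latent variables into a differentiable function of the variational parameters so the bound can be optimized with standard stochastic gradients. A minimal sketch, assuming a diagonal-Gaussian posterior and a standard-normal prior (encoder and decoder networks omitted; function names are illustrative):

```python
import numpy as np

def reparameterize(mu, log_var, rng=None):
    """Draw z ~ N(mu, sigma^2) as a differentiable function of (mu, log_var)."""
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def gaussian_kl(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)
```

The variational bound is then the expected reconstruction log-likelihood minus this KL term.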
Semi-supervised Learning with Deep Generative Models
TLDR
We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones.
  • Citations: 1,583 · Highly influential: 221 · PDF
Glow: Generative Flow with Invertible 1x1 Convolutions
TLDR
In this paper we propose Glow, a simple type of generative model that, when optimized towards the plain log-likelihood objective, is capable of efficient, realistic-looking synthesis and manipulation of large images.
  • Citations: 827 · Highly influential: 207 · PDF
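Glow's distinguishing component is the invertible 1x1 convolution named in the title: a learned per-pixel linear map over channels whose log-determinant contribution to the flow objective is cheap to compute. A minimal NumPy sketch, assuming channel-last activations and a dense weight matrix:

```python
import numpy as np

def invertible_1x1_conv(x, W):
    """x: (H, W, C) activations; W: (C, C) learned weight.
    Returns the transformed activations and the log-det term for the flow objective."""
    h, w, _ = x.shape
    z = x @ W.T                                          # z[i, j] = W @ x[i, j] for every pixel
    logdet = h * w * np.log(np.abs(np.linalg.det(W)))    # contribution to the log-likelihood
    return z, logdet

def invert_1x1_conv(z, W):
    """Exact inverse of the transform, needed for sampling."""
    return z @ np.linalg.inv(W).T
```

The paper additionally parameterizes W via an LU decomposition so the determinant can be computed in O(C) rather than O(C^3); that detail is omitted here.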
Improved Variational Inference with Inverse Autoregressive Flow
TLDR
We propose a new type of normalizing flow, inverse autoregressive flow (IAF), that, in contrast to earlier published flows, scales well to high-dimensional latent spaces.
  • Citations: 839 · Highly influential: 140 · PDF
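Each IAF step rescales and shifts the latent vector using parameters produced by an autoregressive network, so the Jacobian is triangular and its log-determinant is simply the sum of the log scales. A minimal sketch of one step, with the autoregressive network left abstract:

```python
import numpy as np

def iaf_step(z, mu, sigma):
    """One inverse autoregressive flow transform.
    mu and sigma (> 0) must come from an autoregressive network over z, so that
    d z_new[i] / d z[j] = 0 for j > i (triangular Jacobian)."""
    z_new = mu + sigma * z
    log_det = np.sum(np.log(sigma), axis=-1)   # log |det dz_new/dz|
    return z_new, log_det
```

Stacking several such steps (reversing the variable ordering between them) yields a flexible posterior whose density remains cheap to evaluate.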
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
TLDR
We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction.
  • Citations: 935 · Highly influential: 85 · PDF
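The reparameterization itself is a one-liner: each weight vector is expressed as a learned direction scaled by a learned length. A minimal sketch for a single weight vector (extending it row-wise to a full weight matrix is straightforward):

```python
import numpy as np

def weight_norm(v, g):
    """Weight normalization: w = g * v / ||v||.
    The scalar g carries the length of w; v only determines its direction."""
    return g * v / np.linalg.norm(v)
```

Gradients are taken with respect to v and g rather than w, which the paper shows improves the conditioning of the optimization problem.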
Variational Lossy Autoencoder
TLDR
In this paper, we present a simple but principled method to learn global representations by combining the Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE, and PixelRNN/CNN.
  • Citations: 381 · Highly influential: 74 · PDF
PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
TLDR
We use a discretized logistic mixture likelihood on the pixels, rather than a 256-way softmax, which we find to speed up training.
  • Citations: 383 · Highly influential: 73 · PDF
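The discretized logistic likelihood evaluates the probability mass a continuous logistic distribution assigns to the pixel's intensity bin, instead of classifying among 256 discrete values. A single-component sketch, assuming pixel values rescaled to [-1, 1] with 256 bins; the edge-bin handling and the mixture over several components are omitted:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discretized_logistic_logprob(x, mean, log_scale, half_bin=1.0 / 255.0):
    """Log-probability of pixel value x in [-1, 1] under a logistic(mean, scale)
    distribution discretized into 256 bins: the CDF mass of the bin containing x."""
    inv_scale = np.exp(-log_scale)
    cdf_plus = sigmoid(inv_scale * (x - mean + half_bin))    # upper bin edge
    cdf_minus = sigmoid(inv_scale * (x - mean - half_bin))   # lower bin edge
    return np.log(np.clip(cdf_plus - cdf_minus, 1e-12, None))
```

The full model mixes several such components per pixel with learned mixture weights, which is what the title's "discretized logistic mixture likelihood" refers to.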
Learning Sparse Neural Networks through L0 Regularization
TLDR
We propose a practical method for $L_0$ norm regularization for neural networks: pruning the network during training by encouraging weights to become exactly zero.
  • Citations: 363 · Highly influential: 62 · PDF
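The method makes the $L_0$ penalty differentiable by attaching a stochastic "hard concrete" gate to each weight: the gate can be exactly zero with non-zero probability, and the expected number of non-zero gates has a closed form. A minimal sketch under that parameterization (the beta, gamma, zeta defaults follow the paper; function names are illustrative):

```python
import numpy as np

def hard_concrete_gate(log_alpha, beta=2.0 / 3.0, gamma=-0.1, zeta=1.1, rng=None):
    """Sample a gate in [0, 1] that is exactly 0 or 1 with non-zero probability.
    Each weight is multiplied by its gate, so zero gates prune the weight."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=np.shape(log_alpha))
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1.0 - u) + log_alpha) / beta))
    s_stretched = s * (zeta - gamma) + gamma        # stretch to (gamma, zeta)
    return np.clip(s_stretched, 0.0, 1.0)           # hard-clip to [0, 1]

def expected_l0(log_alpha, beta=2.0 / 3.0, gamma=-0.1, zeta=1.1):
    """Closed-form probability that each gate is non-zero: the differentiable L0 penalty."""
    return 1.0 / (1.0 + np.exp(-(log_alpha - beta * np.log(-gamma / zeta))))
```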
Stochastic Gradient VB and the Variational Auto-Encoder
TLDR
We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case.
  • Citations: 164 · Highly influential: 36 · PDF