• Corpus ID: 195767445

The Thermodynamic Variational Objective

Vaden Masrani, Tuan Anh Le, Frank D. Wood
Published in Neural Information Processing Systems
We introduce the thermodynamic variational objective (TVO) for learning in both continuous and discrete deep generative models. The TVO arises from a key connection between variational inference and thermodynamic integration that results in a tighter lower bound to the log marginal likelihood than the standard variational evidence lower bound (ELBO) while remaining as broadly applicable. We provide a computationally efficient gradient estimator for the TVO that applies to continuous… 
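
To make the connection to thermodynamic integration concrete, the sketch below estimates a left Riemann sum over the path of geometric mixtures between the proposal q(z) and the joint p(x, z), which is the usual way the TVO is written. Everything specific here is an assumption chosen for illustration and not taken from the paper: the toy conjugate Gaussian model (for which log p(x) is known in closed form), the evenly spaced partition `betas`, and the helper names `log_normal`, `log_weights`, and `tvo_estimate`. Expectations along the path are approximated with self-normalized importance sampling, and only the value of the bound is computed, not the paper's gradient estimator.

```python
import numpy as np


def log_normal(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def log_weights(x, z, q_mean, q_std):
    """log w = log p(x, z) - log q(z) for the toy model z ~ N(0, 1), x | z ~ N(z, 1)."""
    log_joint = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0)
    return log_joint - log_normal(z, q_mean, q_std)


def tvo_estimate(log_w, betas):
    """Left Riemann sum of E_{pi_beta}[log w] over the partition `betas`.

    pi_beta is proportional to q(z)^(1 - beta) * p(x, z)^beta. Each expectation
    is estimated by self-normalized importance sampling: samples come from q
    and are reweighted by w^beta.
    """
    betas = np.asarray(betas, dtype=float)
    total = 0.0
    for beta_k, beta_next in zip(betas[:-1], betas[1:]):
        scaled = beta_k * log_w
        snw = np.exp(scaled - scaled.max())  # subtract the max for stability
        snw /= snw.sum()
        total += (beta_next - beta_k) * np.sum(snw * log_w)
    return total


rng = np.random.default_rng(0)
x = 1.0
q_mean, q_std = 0.0, 1.0  # hypothetical, deliberately mismatched proposal
z = rng.normal(q_mean, q_std, size=100_000)
log_w = log_weights(x, z, q_mean, q_std)

# A single interval [0, 1] evaluates the integrand only at beta = 0: the ELBO.
elbo = tvo_estimate(log_w, [0.0, 1.0])
# A finer partition gives a tighter bound.
tvo = tvo_estimate(log_w, np.linspace(0.0, 1.0, 11))
# Importance sampling estimate of log p(x), computed stably in log space.
log_mean_w = log_w.max() + np.log(np.mean(np.exp(log_w - log_w.max())))
# Closed form for this toy model: x ~ N(0, 2).
true_log_px = log_normal(x, 0.0, np.sqrt(2.0))

print(f"ELBO        {elbo:.4f}")
print(f"TVO (K=10)  {tvo:.4f}")
print(f"log mean w  {log_mean_w:.4f}")
print(f"log p(x)    {true_log_px:.4f}")
```

Because the integrand is non-decreasing in beta, the left Riemann sum sits between the ELBO and the log marginal likelihood, and refining the partition moves it toward the latter. How the partition points are placed is a design choice that this sketch does not address.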

All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference

An exponential family interpretation of the geometric mixture curve underlying the TVO and various path sampling methods is proposed, which allows the gap in TVO likelihood bounds to be characterized as a sum of KL divergences. A doubly reparameterized gradient estimator is also derived, which improves model learning and allows the TVO to benefit from more refined bounds.

Nested Variational Inference

Nested variational inference (NVI) is developed: a family of methods that learn proposals for nested importance samplers by minimizing a forward or reverse KL divergence at each level of nesting. Optimizing nested objectives is observed to improve sample quality in terms of log average weight and effective sample size.

Gaussian Process Bandit Optimization of the Thermodynamic Variational Objective

This paper introduces a bespoke Gaussian process bandit optimization method that not only automates the one-time selection of the TVO's discretization points but also dynamically adapts their positions over the course of optimization, leading to improved model learning and inference.

Variational Inference for Sequential Data with Future Likelihood Estimates

A novel variational inference algorithm for sequential data is presented, which performs well even when the density from the model is not differentiable, for instance, due to the use of discrete random variables.

GFlowNets and variational inference

This paper builds bridges between two families of probabilistic algorithms: (hierarchical) variational inference (VI), which is typically used to model distributions over continuous spaces, and

NVAE: A Deep Hierarchical Variational Autoencoder

NVAE is the first successful VAE applied to natural images as large as 256$\times$256 pixels and achieves state-of-the-art results among non-autoregressive likelihood-based models on the MNIST, CIFAR-10, CelebA 64, and CelebA HQ datasets and it provides a strong baseline on FFHQ.

Surrogate Likelihoods for Variational Annealed Importance Sampling

This work argues theoretically that the resulting algorithm allows an intuitive trade-off between inference and computational cost, and shows that it performs well in practice and is well-suited for black-box inference in probabilistic programming frameworks.

Controlling the Interaction Between Generation and Inference in Semi-Supervised Variational Autoencoders Using Importance Weighting

Using importance weighting and an analysis of the objective of semi-supervised VAEs, it is shown that they use the posterior of the learned generative model to guide the inference model in learning the partially observed latent variable.

Semi-deterministic and Contrastive Variational Graph Autoencoder for Recommendation

This paper proposes a novel Semi-deterministic and Contrastive Variational Graph autoencoder (SCVG) for item recommendation, and empirically shows that the contrastive regularization makes learned user/item latent representation more personalized and helps to smooth the training process.


This is the first work that proposes attention mechanisms to build more expressive variational distributions in deep probabilistic models by explicitly modeling both nearby and distant interactions in the latent space and achieves state-of-the-art log-likelihoods while using fewer latent layers and requiring less training time than existing models.



Unbiased Implicit Variational Inference

UIVI considers an implicit variational distribution obtained in a hierarchical manner using a simple reparameterizable distribution whose variational parameters are defined by arbitrarily flexible deep neural networks and directly optimizes the evidence lower bound (ELBO).

Auto-Encoding Variational Bayes

A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.

Variational Sequential Monte Carlo

The VSMC family is a variational family that can approximate the posterior arbitrarily well while still allowing for efficient optimization of its parameters. Its utility is demonstrated on state space models, stochastic volatility models for financial data, and deep Markov models of brain neural circuits.

Variational Inference for Monte Carlo Objectives

The first unbiased gradient estimator designed for importance-sampled objectives is developed, which is both simpler and more effective than the NVIL estimator proposed for the single-sample variational objective, and is competitive with the currently used biased estimators.

Neural Variational Inference and Learning in Belief Networks

This work proposes a fast non-iterative approximate inference method that uses a feedforward network to implement efficient exact sampling from the variational posterior and shows that it outperforms the wake-sleep algorithm on MNIST and achieves state-of-the-art results on the Reuters RCV1 document dataset.

Importance Weighted Autoencoders

The importance weighted autoencoder (IWAE) is a generative model with the same architecture as the VAE that uses a strictly tighter log-likelihood lower bound derived from importance weighting. It is shown empirically that IWAEs learn richer latent space representations than VAEs, leading to improved test log-likelihood on density estimation benchmarks.
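
As a rough illustration of the importance-weighted bound mentioned in this summary, the sketch below averages log((1/K) Σ_k w_k) over many batches of K samples and shows that the bound rises toward log p(x) as K grows. The toy model (z ~ N(0, 1), x | z ~ N(z, 1), with the prior used as the proposal), the helper names `log_normal` and `iwae_bound`, and the chosen values of K are assumptions made for illustration. No autoencoder is trained here.

```python
import numpy as np


def log_normal(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def iwae_bound(x, num_particles, num_batches, rng):
    """Monte Carlo estimate of E[log (1/K) sum_k w_k] with K = num_particles.

    The proposal is the prior N(0, 1), so log w reduces to log p(x | z).
    """
    z = rng.normal(0.0, 1.0, size=(num_batches, num_particles))
    log_w = log_normal(x, z, 1.0)
    # Log-mean-exp over the particle axis, stabilized by subtracting the max.
    m = log_w.max(axis=1, keepdims=True)
    log_mean_w = m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))
    return log_mean_w.mean()


rng = np.random.default_rng(0)
x = 1.0
bounds = {k: iwae_bound(x, k, 4000, rng) for k in (1, 5, 50)}
true_log_px = log_normal(x, 0.0, np.sqrt(2.0))  # marginal is x ~ N(0, 2)

for k, value in bounds.items():
    print(f"K={k:3d}  bound {value:.4f}")
print(f"log p(x)     {true_log_px:.4f}")
```

With K = 1 the quantity is the ordinary ELBO. Larger K tightens the bound at the cost of more likelihood evaluations per data point.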

Tighter Variational Bounds are Not Necessarily Better

We provide theoretical and empirical evidence that using tighter evidence lower bounds (ELBOs) can be detrimental to the process of learning an inference network by reducing the signal-to-noise ratio

Filtering Variational Objectives

A family of lower bounds defined by a particle filter's estimator of the marginal likelihood, the filtering variational objectives (FIVOs), are considered, which take the same arguments as the ELBO, but can exploit a model's sequential structure to form tighter bounds.

Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives

A computationally efficient, unbiased drop-in gradient estimator that reduces the variance of the IWAE gradient, the reweighted wake-sleep update (RWS), and the jackknife variational inference (JVI) gradient (Nowozin, 2018).

Bidirectional Helmholtz Machines

A new model is proposed which guarantees that the top-down and bottom-up distributions can efficiently invert each other, and which results in state of the art generative models which prefer significantly deeper architectures while it allows for orders of magnitude more efficient approximate inference.