Corpus ID: 219955663

Denoising Diffusion Probabilistic Models

Authors: Jonathan Ho, Ajay Jain, P. Abbeel

We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our models naturally admit a progressive lossy decompression scheme that can be interpreted as a… 

Improved Denoising Diffusion Probabilistic Models

This work shows that with a few simple modifications, DDPMs can also achieve competitive log-likelihoods while maintaining high sample quality, and finds that learning variances of the reverse diffusion process allows sampling with an order of magnitude fewer forward passes with a negligible difference in sample quality.

Denoising Diffusion Implicit Models

Denoising diffusion implicit models (DDIMs) are presented, a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs that can produce high quality samples faster and perform semantically meaningful image interpolation directly in the latent space.

Noise Estimation for Generative Diffusion Models

This work presents a simple and versatile learning scheme that adjusts the noise parameters step by step for any given number of steps, whereas previous work needs to retune them for each number of steps separately.

Stochastic Image Denoising by Sampling from the Posterior Distribution

This work proposes a novel stochastic denoising approach that produces viable and high perceptual quality results, while maintaining a small MSE, and presents an extension of the algorithm for handling the inpainting problem, recovering missing pixels while removing noise from partially given data.

Improved Autoregressive Modeling with Distribution Smoothing

This work incorporates randomized smoothing into autoregressive generative modeling by first modeling a smoothed version of the data distribution and then reversing the smoothing process to recover the original data distribution.

Learning Energy-Based Models by Diffusion Recovery Likelihood

This work presents a diffusion recovery likelihood method to tractably learn and sample from a sequence of EBMs trained on increasingly noisy versions of a dataset, and demonstrates that unlike previous work on EBMs, long-run MCMC samples from the conditional distributions do not diverge and still represent realistic images, allowing us to accurately estimate the normalized density of data even for high-dimensional datasets.

Image Super-Resolution via Iterative Refinement

The effectiveness of SR3 is shown in cascaded image generation, where a generative model is chained with super-resolution models to synthesize high-resolution images with competitive FID scores on the class-conditional 256×256 ImageNet generation challenge.

Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed

A novel connection between knowledge distillation and image generation is established with a technique that distills a multi-step denoising process into a single step, resulting in a sampling speed similar to other single-step generative models.

Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting

TimeGrad, an autoregressive model for multivariate probabilistic time series forecasting which samples from the data distribution at each time step by estimating its gradient, is proposed.

Locally Masked Convolution for Autoregressive Models

LMConv is introduced: a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image, achieving improved performance on whole-image density estimation and globally coherent image completions.



FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models

This paper uses Hutchinson's trace estimator to give a scalable unbiased estimate of the log-density and demonstrates the approach on high-dimensional density estimation, image generation, and variational inference, achieving the state-of-the-art among exact likelihood methods with efficient sampling.
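
As a rough illustration of the estimator named in the summary above, the sketch below estimates the trace of a matrix using only matrix-vector products. It is a minimal example under stated assumptions, not the paper's implementation: the small explicit matrix `A`, the helper names `hutchinson_trace` and `matvec`, and the choice of Rademacher probe vectors are all illustrative. In a continuous-flow setting the matrix-vector product would instead come from a Jacobian-vector product of the learned dynamics.

```python
import random

def hutchinson_trace(matvec, dim, num_samples=1000, seed=0):
    """Estimate tr(A) from products v -> A v, without forming A's diagonal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        # Rademacher probe: each entry is +1 or -1 with equal probability,
        # so E[v v^T] = I and E[v^T A v] = tr(A) (an unbiased estimate).
        v = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        av = matvec(v)
        total += sum(vi * avi for vi, avi in zip(v, av))
    return total / num_samples

# Hypothetical stand-in matrix; only its action on vectors is used below.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 4.0]]

def matvec(v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

estimate = hutchinson_trace(matvec, dim=3, num_samples=5000)
exact = sum(A[i][i] for i in range(3))
print(f"estimate={estimate:.3f} exact={exact:.3f}")
```

The estimate is noisy for any finite number of probes; the off-diagonal entries of the matrix determine how quickly it concentrates around the exact trace.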

Generalizing Hamiltonian Monte Carlo with Neural Networks

This work presents a general-purpose method to train Markov chain Monte Carlo kernels, parameterized by deep neural networks, that converge and mix quickly to their target distribution, and releases an open source TensorFlow implementation.

Auto-Encoding Variational Bayes

A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.

Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design

Flow++ is proposed, a new flow-based model that is now the state-of-the-art non-autoregressive model for unconditional density estimation on standard image benchmarks, and has begun to close the significant performance gap that has so far existed between autoregressive models and flow-based models.

Pixel Recurrent Neural Networks

A deep neural network is presented that sequentially predicts the pixels in an image along the two spatial dimensions and encodes the complete set of dependencies in the image to achieve log-likelihood scores on natural images that are considerably better than the previous state of the art.

Generative Modeling by Estimating Gradients of the Data Distribution

A new generative model is introduced in which samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching; the approach allows flexible model architectures, requires no sampling during training or the use of adversarial methods, and provides a learning objective that can be used for principled model comparisons.
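
The sampling idea in the summary above can be sketched in one dimension. This is a minimal illustration under stated assumptions, not the paper's method: the score (the gradient of the log-density) is written analytically for a Gaussian target instead of being learned with score matching, and the names `langevin_sample`, `gaussian_score`, `MU`, and `SIGMA`, as well as the step size and step count, are hypothetical choices.

```python
import math
import random

def langevin_sample(score, x0, step_size, num_steps, rng):
    """Run unadjusted Langevin dynamics from x0 and return the final state."""
    x = x0
    for _ in range(num_steps):
        noise = rng.gauss(0.0, 1.0)
        # Drift along the score, plus Gaussian noise scaled by sqrt(step_size).
        x = x + 0.5 * step_size * score(x) + math.sqrt(step_size) * noise
    return x

# Assumed target: a Gaussian with known mean and standard deviation.
MU, SIGMA = 3.0, 1.0

def gaussian_score(x):
    # d/dx log N(x; MU, SIGMA^2); a learned network would replace this.
    return -(x - MU) / SIGMA ** 2

rng = random.Random(0)
samples = [
    langevin_sample(gaussian_score, rng.uniform(-5.0, 5.0), 0.1, 200, rng)
    for _ in range(2000)
]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(f"mean={sample_mean:.2f} var={sample_var:.2f}")
```

Because the step size is finite and no correction step is applied, the chains settle near the target distribution with a small discretization bias in the variance.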

Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling

The Subscale Pixel Network (SPN) is proposed, a conditional decoder architecture that generates an image as a sequence of sub-images of equal size that compactly captures image-wide spatial dependencies and requires a fraction of the memory and the computation required by other fully autoregressive models.

Improved Variational Inference with Inverse Autoregressive Flow

A new type of normalizing flow, inverse autoregressive flow (IAF), is proposed that, in contrast to earlier published flows, scales well to high-dimensional latent spaces and significantly improves upon diagonal Gaussian approximate posteriors.

Generating Diverse High-Fidelity Images with VQ-VAE-2

It is demonstrated that a multi-scale hierarchical organization of VQ-VAE, augmented with powerful priors over the latent codes, is able to generate samples with quality that rivals that of state-of-the-art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GANs' known shortcomings such as mode collapse and lack of diversity.

Large Scale GAN Training for High Fidelity Natural Image Synthesis

It is found that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input.
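
One common way to realize the "truncation trick" described above is to resample any latent entry whose magnitude exceeds a threshold, which lowers the variance of the generator's input. The sketch below shows only that sampling step, as a hedged illustration: the function name `truncated_latent`, the latent dimension, and the threshold value are assumptions, and no generator network is included.

```python
import random

def truncated_latent(dim, threshold, rng):
    """Draw a latent vector from N(0, 1), resampling entries with |z| > threshold."""
    z = []
    for _ in range(dim):
        x = rng.gauss(0.0, 1.0)
        while abs(x) > threshold:
            x = rng.gauss(0.0, 1.0)
        z.append(x)
    return z

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

rng = random.Random(0)
full = [rng.gauss(0.0, 1.0) for _ in range(10000)]              # untruncated input
truncated = truncated_latent(10000, threshold=0.5, rng=rng)      # truncated input
print(f"var(full)={variance(full):.2f} var(truncated)={variance(truncated):.2f}")
```

A smaller threshold concentrates the latents near zero, which in the summary's terms trades sample variety for fidelity; a large threshold recovers ordinary Gaussian sampling.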