Corpus ID: 235390993

Score-based Generative Modeling in Latent Space

@inproceedings{Vahdat2021ScorebasedGM,
  title={Score-based Generative Modeling in Latent Space},
  author={Arash Vahdat and Karsten Kreis and Jan Kautz},
  booktitle={NeurIPS},
  year={2021}
}
Score-based generative models (SGMs) have recently demonstrated impressive results in terms of both sample quality and distribution coverage. However, they are usually applied directly in data space and often require thousands of network evaluations for sampling. Here, we propose the Latent Score-based Generative Model (LSGM), a novel approach that trains SGMs in a latent space, relying on the variational autoencoder framework. Moving from data to latent space allows us to train more expressive… 
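The abstract above describes the core recipe: map data into a VAE latent space and run score-based generative modeling there. The block below is a heavily simplified sketch of that idea; all module names, sizes, and the toy noise schedule are assumptions for illustration, and LSGM itself trains the VAE and the latent score model jointly with a more careful parameterization rather than in the naive fashion shown here.

```python
import torch
import torch.nn as nn

# All module names, sizes, and the noise schedule below are illustrative assumptions,
# not the architecture or objective of the LSGM paper.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 128))      # x -> latent code z
decoder = nn.Sequential(nn.Linear(128, 784), nn.Sigmoid())      # z -> reconstruction
score_net = nn.Sequential(nn.Linear(128 + 1, 256), nn.SiLU(),
                          nn.Linear(256, 128))                   # (z_t, t) -> noise estimate

def latent_dsm_loss(z0, eps=1e-5):
    """Denoising score matching on latents for a toy variance-preserving diffusion."""
    t = torch.rand(z0.shape[0], 1) * (1 - eps) + eps             # diffusion time in (0, 1]
    alpha = torch.exp(-0.5 * t)                                   # toy noise schedule
    sigma = torch.sqrt(1 - alpha ** 2)
    noise = torch.randn_like(z0)
    zt = alpha * z0 + sigma * noise                               # perturbed latent
    pred = score_net(torch.cat([zt, t], dim=1))                   # predict the added noise
    return ((pred - noise) ** 2).mean()

# One illustrative training step: encode a batch, reconstruct, and fit the latent score model.
x = torch.rand(32, 1, 28, 28)                                     # dummy image batch
z = encoder(x)
loss = ((decoder(z) - x.flatten(1)) ** 2).mean() + latent_dsm_loss(z)
loss.backward()
```

Sampling would then run the reverse diffusion in the low-dimensional latent space and decode the result, which is part of where the claimed efficiency gains over data-space SGMs come from.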
Controllable and Compositional Generation with Latent-Space Energy-Based Models
TLDR
This work uses energy-based models (EBMs) to handle compositional generation over a set of attributes and is the first to achieve such compositionality in generating photo-realistic images of resolution 1024×1024.
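For context on the compositional mechanism mentioned in this entry: one common way to compose attribute-conditional energy-based models (a generic product-of-experts formulation, not necessarily the exact one used in this paper) is to sum per-attribute energies:

```latex
% Generic composition of attribute-conditional EBMs by summing energies
% (product-of-experts form); E_i scores how well x matches attribute c_i.
p(x \mid c_1, \dots, c_n) \;\propto\; \exp\!\Bigl(-\sum_{i=1}^{n} E_i(x, c_i)\Bigr)
```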
Score-Based Generative Models Detect Manifolds
TLDR
This analysis provides precise conditions under which SGMs are able to produce samples from an underlying (low-dimensional) data manifold M, and provides a precise description of when the SGM memorizes its training data.
Score-Based Generative Modeling with Critically-Damped Langevin Diffusion
TLDR
A novel critically-damped Langevin diffusion (CLD) is proposed and it is shown that CLD outperforms previous SGMs in synthesis quality for similar network architectures and sampling compute budgets, and that the novel sampler for CLD significantly outperforms solvers such as Euler–Maruyama.
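For context, critically-damped Langevin diffusion augments each data variable with a velocity variable and diffuses the pair with underdamped Langevin dynamics run at critical damping; a generic form of such dynamics (not the paper's exact parameterization or noise schedule) is

```latex
% Underdamped Langevin dynamics on position x_t with auxiliary velocity v_t;
% gamma is the friction coefficient and W_t a standard Wiener process.
\begin{aligned}
\mathrm{d}x_t &= v_t\,\mathrm{d}t,\\
\mathrm{d}v_t &= \bigl(-\nabla U(x_t) - \gamma\, v_t\bigr)\,\mathrm{d}t + \sqrt{2\gamma}\;\mathrm{d}W_t .
\end{aligned}
```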
Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling
TLDR
This work views the diffusion sampling process as a Metropolis-adjusted Langevin algorithm, which helps reveal the underlying cause to be ill-conditioned curvature, and proposes a model-agnostic preconditioned diffusion sampling (PDS) method that leverages matrix preconditioning to alleviate the aforementioned problem.
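As a reference point for the Metropolis-adjusted Langevin view mentioned above, a Langevin step preconditioned by a fixed positive-definite matrix P takes the generic form below; the particular preconditioner constructed by PDS is specific to that paper and not reproduced here.

```latex
% Preconditioned Langevin update with step size epsilon; MALA additionally applies
% an accept/reject correction to this proposal.
x_{k+1} \;=\; x_k \;+\; \tfrac{\epsilon}{2}\, P\, \nabla_x \log p(x_k)
\;+\; \sqrt{\epsilon}\; P^{1/2}\, \xi_k, \qquad \xi_k \sim \mathcal{N}(0, I)
```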
Score-Guided Intermediate Layer Optimization: Fast Langevin Mixing for Inverse Problems
TLDR
The framework, SGILO, extends prior work by replacing the sparsity regularization with a generative prior in the intermediate layer: a score-based model is trained in the latent space of a StyleGAN-2 and used to solve inverse problems.
DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents
TLDR
The resulting model can improve upon the unconditional diffusion model in terms of sampling efficiency while also equipping diffusion models with a low-dimensional latent code inferred by the VAE, and it exhibits synthesis quality comparable to state-of-the-art models on standard benchmarks.
SPI-GAN: Distilling Score-based Generative Models with Straight-Path Interpolations
TLDR
An enhanced distillation method, called straight-path interpolation GAN (SPI-GAN), is proposed; compared with state-of-the-art shortcut-based distillation methods, it is among the best models in terms of sampling quality, diversity, and time for CIFAR-10, CelebA-HQ-256, and LSUN-Church-256.
The deep generative decoder: Using MAP estimates of representations
TLDR
This work argues that it is worthwhile to investigate a much simpler approximation which finds representations and their distribution by maximizing the model likelihood via back-propagation, and calls it a Deep Generative Decoder (DGD).
Few-Shot Diffusion Models
TLDR
Few-Shot Diffusion Models (FSDM), a framework for few-shot generation leveraging conditional DDPMs, is introduced, and it is shown how conditioning the model on patch-based input set information improves training convergence.
D2C: Diffusion-Denoising Models for Few-shot Conditional Generation
TLDR
This work describes Diffusion-Decoding models with Contrastive representations (D2C), a paradigm for training unconditional variational autoencoders (VAEs) for few-shot conditional image generation that uses contrastive self-supervised learning to improve representation quality.
...
...

References

SHOWING 1-10 OF 111 REFERENCES
Controllable and Compositional Generation with Latent-Space Energy-Based Models
TLDR
This work uses energy-based models (EBMs) to handle compositional generation over a set of attributes and is the first to achieve such compositionality in generating photo-realistic images of resolution 1024×1024.
BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling
TLDR
This paper introduces the Bidirectional-Inference Variational Autoencoder (BIVA), characterized by a skip-connected generative model and an inference network formed by a bidirectional stochastic inference path, and shows that BIVA reaches state-of-the-art test likelihoods, generates sharp and coherent natural images, and uses the hierarchy of latent variables to capture different aspects of the data distribution.
MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders
Variational Autoencoder (VAE), a simple and effective deep generative model, has led to a number of impressive empirical successes and spawned many advanced variants and theoretical investigations.
Importance Weighted Autoencoders
TLDR
The importance weighted autoencoder (IWAE), a generative model with the same architecture as the VAE, but which uses a strictly tighter log-likelihood lower bound derived from importance weighting, shows empirically that IWAEs learn richer latent space representations than VAEs, leading to improved test log-likelihood on density estimation benchmarks.
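For reference, the importance-weighted bound with k posterior samples that this entry refers to is

```latex
% IWAE objective: a lower bound on log p(x) that tightens as k increases;
% k = 1 recovers the standard ELBO.
\mathcal{L}_k(x) \;=\; \mathbb{E}_{z_1,\dots,z_k \sim q_\phi(z\mid x)}
\left[\, \log \frac{1}{k} \sum_{i=1}^{k} \frac{p_\theta(x, z_i)}{q_\phi(z_i \mid x)} \,\right]
\;\le\; \log p_\theta(x).
```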
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
TLDR
This paper proposes the first large-scale language VAE model, Optimus, a universal latent embedding space for sentences that is first pre-trained on large text corpus, and then fine-tuned for various language generation and understanding tasks.
From Variational to Deterministic Autoencoders
TLDR
It is shown, in a rigorous empirical study, that the proposed regularized deterministic autoencoders are able to generate samples that are comparable to, or better than, those of VAEs and more powerful alternatives when applied to images as well as to structured data such as molecules.
Refining Deep Generative Models via Discriminator Gradient Flow
TLDR
Empirical results demonstrate that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models, outperforming the state-of-the-art Discriminator Optimal Transport (DOT) and Discriminator Driven Latent Sampling (DDLS) methods.
VAE with a VampPrior
TLDR
This paper proposes to extend the variational auto-encoder (VAE) framework with a new type of prior called "Variational Mixture of Posteriors" prior, or VampPrior for short, which consists of a mixture distribution with components given by variational posteriors conditioned on learnable pseudo-inputs.
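For reference, the VampPrior replaces a fixed prior over latents with a mixture of the encoder's variational posteriors evaluated at K learnable pseudo-inputs u_k:

```latex
% VampPrior: mixture of variational posteriors at learnable pseudo-inputs u_1, ..., u_K.
p_\lambda(z) \;=\; \frac{1}{K} \sum_{k=1}^{K} q_\phi\!\left(z \mid u_k\right)
```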
Distribution Augmentation for Generative Modeling
TLDR
DistAug is presented, a simple and powerful method of regularizing generative models that applies augmentation functions to data and conditions the generative model on the specific function used, enabling aggressive augmentations more commonly seen in supervised and self-supervised learning.
Improved Techniques for Training Score-Based Generative Models
TLDR
This work provides a new theoretical analysis of learning and sampling from score models in high dimensional spaces, explaining existing failure modes and motivating new solutions that generalize across datasets.
...
...