Convergence of score-based generative modeling for general data distributions

@article{Lee2022ConvergenceOS,
  title={Convergence of score-based generative modeling for general data distributions},
  author={Holden Lee and Jianfeng Lu and Yixin Tan},
  journal={ArXiv},
  year={2022},
  volume={abs/2209.12381}
}
Score-based generative modeling (SGM) has grown to be a hugely successful method for learning to generate samples from complex data distributions such as those of images and audio. It is based on evolving an SDE that transforms white noise into a sample from the learned distribution, using estimates of the score function, or gradient of the log-pdf. Previous convergence analyses for these methods have suffered either from strong assumptions on the data distribution or exponential dependencies, and…
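
As a rough illustration of the mechanism the abstract describes (not the paper's algorithm), here is a minimal Python sketch of an Euler–Maruyama discretization of the reverse-time SDE for an Ornstein–Uhlenbeck forward process; score_fn is a hypothetical stand-in for the learned score estimate.

import numpy as np

def reverse_sde_sample(score_fn, dim, n_steps=1000, T=1.0, seed=0):
    """Euler-Maruyama discretization of the reverse-time SDE for the
    forward process dx = -x dt + sqrt(2) dW (so g^2 = 2).
    score_fn(x, t) is assumed to return an estimate of grad log p_t(x)."""
    rng = np.random.default_rng(seed)
    h = T / n_steps
    x = rng.standard_normal(dim)            # start from white noise ~ N(0, I)
    for k in range(n_steps):
        t = T - k * h                        # integrate backwards in time
        drift = x + 2.0 * score_fn(x, t)     # reverse drift: -f(x) + g^2 * score
        x = x + h * drift + np.sqrt(2.0 * h) * rng.standard_normal(dim)
    return x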

Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

It is shown that score-based generative models such as denoising diffusion probabilistic models (DDPMs) can efficiently sample from essentially any realistic data distribution, and theoretical convergence guarantees for these models hold for an L²-accurate score estimate.
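
For concreteness, an L²-accurate score estimate is usually taken to mean a bound of the following form (notation assumed here; the paper may state it slightly differently):

\mathbb{E}_{x \sim p_t}\!\left[\, \| s_\theta(x, t) - \nabla \log p_t(x) \|^2 \,\right] \le \varepsilon^2 \quad \text{for the relevant times } t .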

Improved Analysis of Score-based Generative Modeling: User-Friendly Bounds under Minimal Smoothness Assumptions

Under an L²-accurate score estimator, convergence guarantees with polynomial complexity are provided for any data distribution with a finite second moment, by either employing an early-stopping technique or assuming a smoothness condition on the score function of the data distribution.

Convergence in KL Divergence of the Inexact Langevin Algorithm with Application to Score-based Generative Models

The Inexact Langevin Algorithm for sampling with an estimated score function when the target distribution satisfies a log-Sobolev inequality (LSI) is studied, motivated by Score-based Generative Modeling (SGM), and long-term convergence in Kullback-Leibler divergence is proved.
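
In its standard form, the Inexact Langevin Algorithm is the Langevin iteration run with an estimated score \hat{s} \approx \nabla \log \pi in place of the true one; a sketch of the usual update with step size h is:

x_{k+1} = x_k + h\,\hat{s}(x_k) + \sqrt{2h}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I).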

Statistical Efficiency of Score Matching: The View from Isoperimetry

This paper shows that the score matching estimator is statistically comparable to the maximum likelihood estimator when the distribution has a small isoperimetric constant, and shows a direct parallel in the discrete setting, where it connects the statistical properties of pseudolikelihood estimation with approximate tensorization of entropy and the Glauber dynamics.
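
For reference, the (Hyvärinen) score matching objective the estimator minimizes can be written in its standard form as (notation assumed):

J(\theta) = \mathbb{E}_{x \sim p}\!\left[\, \tfrac{1}{2}\,\| s_\theta(x) \|^2 + \operatorname{tr}\!\big(\nabla_x s_\theta(x)\big) \,\right],

which, up to a constant independent of \theta, equals \tfrac{1}{2}\,\mathbb{E}_{x \sim p}\,\| s_\theta(x) - \nabla \log p(x) \|^2.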

Fisher information lower bounds for sampling

We prove two lower bounds for the complexity of non-log-concave sampling within the framework of Balasubramanian et al. (2022), who introduced the use of Fisher information (FI) bounds as a notion…
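
The Fisher information referred to here is the relative Fisher information of the algorithm's iterate \rho with respect to the target \pi (standard definition, notation assumed):

\mathrm{FI}(\rho \,\|\, \pi) = \mathbb{E}_{x \sim \rho}\!\left[\, \big\| \nabla \log \tfrac{\rho(x)}{\pi(x)} \big\|^2 \,\right].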

Thompson Sampling with Diffusion Generative Prior

This work focuses on the meta-learning framework for bandits, aiming to learn a strategy that performs well across bandit tasks of the same class, and trains a diffusion model that learns the underlying task distribution and combines Thompson sampling with the learned prior to deal with new tasks at test time.
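
As a rough, hypothetical sketch of how a learned prior can be plugged into Thompson sampling (this is a generic skeleton, not the paper's method), the routine sample_posterior below is an assumed stand-in for drawing one plausible vector of per-arm mean rewards, e.g. from a diffusion prior conditioned on the observations so far; pull_arm is an assumed environment interface.

import numpy as np

def thompson_sampling(sample_posterior, pull_arm, n_arms, horizon):
    """Generic Thompson sampling loop with a pluggable posterior sampler."""
    history = []                              # list of (arm, reward) pairs
    rewards = []
    for _ in range(horizon):
        theta = sample_posterior(history)     # one sampled vector of arm means
        arm = int(np.argmax(theta[:n_arms]))  # act greedily w.r.t. the sample
        r = pull_arm(arm)
        history.append((arm, r))
        rewards.append(r)
    return history, rewards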

References

Showing 1–10 of 25 references.

Convergence for score-based generative modeling with polynomial complexity

This work proves the first polynomial convergence guarantees for the core mechanic behind SGM: drawing samples from a probability density p given a score estimate (an estimate of ∇ ln p) that is accurate in L²(p); the guarantee works for any smooth distribution and depends polynomially on its log-Sobolev constant.
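
For context, a distribution p satisfies a log-Sobolev inequality with constant C_{LS} if, for all smooth f (standard statement; this constant is what the polynomial dependence refers to):

\operatorname{Ent}_p\!\left(f^2\right) \;=\; \mathbb{E}_p\!\left[f^2 \log \tfrac{f^2}{\mathbb{E}_p f^2}\right] \;\le\; 2\,C_{\mathrm{LS}}\; \mathbb{E}_p\!\left[\|\nabla f\|^2\right].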

Generative Modeling by Estimating Gradients of the Data Distribution

A new generative model is introduced in which samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching; it allows flexible model architectures, requires no sampling during training or the use of adversarial methods, and provides a learning objective that can be used for principled model comparisons.
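
A minimal Python sketch of annealed Langevin dynamics in the style of this line of work is given below; score_fn (assumed to estimate the score of the data distribution smoothed with noise of scale sigma), the noise schedule, and the step-size rule are illustrative assumptions.

import numpy as np

def annealed_langevin_sample(score_fn, dim, sigmas, steps_per_level=100,
                             eps=2e-5, seed=0):
    """Annealed Langevin dynamics over a decreasing sequence of noise levels."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=dim)                     # arbitrary initialization
    for sigma in sigmas:
        alpha = eps * (sigma / sigmas[-1]) ** 2   # step size scaled per noise level
        for _ in range(steps_per_level):
            z = rng.standard_normal(dim)
            x = x + 0.5 * alpha * score_fn(x, sigma) + np.sqrt(alpha) * z
    return x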

Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

It is shown that score-based generative models such as denoising diffusion probabilistic models (DDPMs) can efficiently sample from essentially any realistic data distribution, and theoretical convergence guarantees for these models hold for an L²-accurate score estimate.

Score-Based Generative Modeling through Stochastic Differential Equations

This work presents a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise.
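
In this SDE framework, the forward noising process and its reverse can be written in the standard forms (with \bar{w} denoting a reverse-time Brownian motion):

\mathrm{d}x = f(x, t)\,\mathrm{d}t + g(t)\,\mathrm{d}w \quad \text{(forward)}, \qquad
\mathrm{d}x = \big[f(x, t) - g(t)^2 \nabla_x \log p_t(x)\big]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{w} \quad \text{(reverse)} .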

Convergence of denoising diffusion models under the manifold hypothesis

This paper provides the first convergence results for diffusion models in this setting, giving quantitative bounds on the Wasserstein distance of order one between the target data distribution and the generative distribution of the diffusion model.
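
The Wasserstein distance of order one between the data distribution \mu and the model distribution \nu is, in its standard definition,

W_1(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)} \int \|x - y\| \, \mathrm{d}\gamma(x, y) = \sup_{\mathrm{Lip}(f) \le 1} \big( \mathbb{E}_\mu[f] - \mathbb{E}_\nu[f] \big).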

Improved Techniques for Training Score-Based Generative Models

This work provides a new theoretical analysis of learning and sampling from score models in high dimensional spaces, explaining existing failure modes and motivating new solutions that generalize across datasets.

Generative Modeling with Denoising Auto-Encoders and Langevin Sampling

It is shown that both DAE and DSM provide estimates of the score of the Gaussian-smoothed population density, allowing the machinery of empirical processes to apply to the homotopy method of arXiv:1907.05600.
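
The score of the Gaussian-smoothed density mentioned here relates to the optimal denoiser through Tweedie's formula (standard identity; here \tilde{x} is the noisy observation of x at noise scale \sigma):

\nabla_{\tilde{x}} \log p_\sigma(\tilde{x}) = \frac{\mathbb{E}[x \mid \tilde{x}] - \tilde{x}}{\sigma^2}, \qquad p_\sigma = p * \mathcal{N}(0, \sigma^2 I).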

Subspace Diffusion Generative Models

This work restricts the diffusion via projections onto subspaces as the data distribution evolves toward noise, which simultaneously improves sample quality and reduces the computational cost of inference for the same number of denoising steps.

A Connection Between Score Matching and Denoising Autoencoders

A proper probabilistic model for the denoising autoencoder technique is defined, making it possible in principle to sample from such models or rank examples by their energy, and a different way to apply score matching is suggested that is related to learning to denoise and does not require computing second derivatives.
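
The denoising score matching objective arising from this connection can be written in its standard form as (s_\theta is the score model and \sigma the corruption level):

\mathcal{L}_{\mathrm{DSM}}(\theta) = \mathbb{E}_{x \sim p,\; \tilde{x} \sim \mathcal{N}(x, \sigma^2 I)}\!\left[\, \Big\| s_\theta(\tilde{x}) + \frac{\tilde{x} - x}{\sigma^2} \Big\|^2 \,\right],

whose minimizer is the score \nabla \log p_\sigma of the Gaussian-smoothed density.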

Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

This approach is the first to achieve performance rivaling the state-of-the-art in both generative and discriminative learning within one hybrid model.