Corpus ID: 209318141

Evaluating Lossy Compression Rates of Deep Generative Models

Sicong Huang, Alireza Makhzani, Yanshuai Cao, Roger B. Grosse
The field of deep generative modeling has succeeded in producing astonishingly realistic-seeming images and audio, but quantitative evaluation remains a challenge. Log-likelihood is an appealing metric due to its grounding in statistics and information theory, but it can be challenging to estimate for implicit generative models, and scalar-valued metrics give an incomplete picture of a model's quality. In this work, we propose to use rate distortion (RD) curves to evaluate and compare deep…
Quantitative Understanding of VAE by Interpreting ELBO as Rate Distortion Cost of Transform Coding
It is shown theoretically and experimentally that VAE can be mapped to an implicit isometric embedding with a scale factor derived from the posterior parameter that can estimate the data probabilities in the input space from the prior, loss metrics, and corresponding posterior parameters.
α-VAEs: Optimising variational inference by learning data-dependent divergence skew
  • 2021
The skew-geometric Jensen-Shannon divergence (JSα) allows for an intuitive interpolation between forward and reverse Kullback-Leibler (KL) divergence based on the skew parameter α. While the…
Universal Rate-Distortion-Perception Representations for Lossy Compression
It is proved that the corresponding information-theoretic universal rate-distortion-perception function is operationally achievable in an approximate sense, and motivates the study of practical constructions that are approximately universal across the RDP tradeoff, thereby alleviating the need to design a new encoder for each objective.
The rate-distortion function R(D) tells us the minimal number of bits on average needed to compress a random object within a given distortion tolerance. A lower bound on R(D) therefore represents a…
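For context, the quantity referred to here is Shannon's rate-distortion function, whose standard definition (general information theory, not specific to this entry) is a constrained mutual-information minimisation:

```latex
R(D) \;=\; \min_{p(\hat{x}\mid x)\,:\,\mathbb{E}[d(X,\hat{X})]\,\le\, D} \; I(X;\hat{X})
```

Any achievable code rate at distortion level D is lower-bounded by R(D), which is why lower bounds on R(D) are of interest.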
All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference
An exponential family interpretation of the geometric mixture curve underlying the TVO and various path sampling methods is proposed, which expresses the gap in TVO likelihood bounds as a sum of KL divergences and derives a doubly reparameterized gradient estimator that improves model learning and allows the TVO to benefit from more refined bounds.
Likelihood Ratio Exponential Families
This work extends likelihood ratio exponential families to include solutions to rate-distortion (RD) optimization, the Information Bottleneck (IB) method, and recent rate-distortion-classification approaches which combine RD and IB, providing a common mathematical framework for understanding these methods via the conjugate duality of exponential families and hypothesis testing.
Annealed Flow Transport Monte Carlo
A novel Monte Carlo algorithm that builds upon AIS and SMC and combines them with normalizing flows (NFs) for improved performance is proposed, and a continuous-time scaling limit of the population version of AFT is given by a Feynman–Kac measure.
Quantitative Understanding of VAE as a Non-linearly Scaled Isometric Embedding
It is clarified theoretically and experimentally that VAE can be mapped to an implicit isometric embedding with a scale factor derived from the posterior parameter, and the quantitative importance of each latent variable can be evaluated like the eigenvalue of PCA.
Denoising Diffusion Probabilistic Models
High quality image synthesis results are presented using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics, which naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.
Rate-Regularization and Generalization in VAEs
It is shown that generalization performance continues to improve even after the mutual information saturates, indicating that the gap on the bound affects generalization, suggesting that the standard spherical Gaussian prior is not an inductive bias that typically improves generalization.


On the Quantitative Analysis of Decoder-Based Generative Models
This work proposes to use Annealed Importance Sampling for evaluating log-likelihoods of decoder-based models and validates its accuracy using Bidirectional Monte Carlo, then analyzes the performance of decoder-based models, the effectiveness of existing log-likelihood estimators, the degree of overfitting, and the degree to which these models miss important modes of the data distribution.
Assessing Generative Models via Precision and Recall
A novel definition of precision and recall for distributions is proposed that disentangles the divergence into two separate dimensions; it is intuitive, retains desirable properties, and naturally leads to an efficient algorithm that can be used to evaluate generative models.
Practical Lossless Compression with Latent Variables using Bits Back Coding
Bits Back with ANS (BB-ANS) is presented, a scheme to perform lossless compression with latent variable models at a near optimal rate, and it is concluded that with a sufficiently high quality generative model this scheme could be used to achieve substantial improvements in compression rate with acceptable running time.
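The "near optimal rate" of bits-back coding follows from a textbook identity (stated here for context, not taken from this entry): the expected net message length when encoding x via a latent z equals the negative evidence lower bound,

```latex
\mathbb{E}_{q(z\mid x)}\big[-\log p(x\mid z) - \log p(z) + \log q(z\mid x)\big]
\;=\; -\mathrm{ELBO}(x) \;\ge\; -\log p(x),
```

so a better generative model (a tighter ELBO) translates directly into a shorter compressed message.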
An empirical study on evaluation metrics of generative adversarial networks
This paper comprehensively investigates existing sample-based evaluation metrics for GANs and observes that kernel Maximum Mean Discrepancy and the 1-Nearest-Neighbor (1-NN) two-sample test seem to satisfy most of the desirable properties, provided that the distances between samples are computed in a suitable feature space.
Large Scale GAN Training for High Fidelity Natural Image Synthesis
It is found that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the generator's input.
Improved Techniques for Training GANs
This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic, and presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.
A note on the evaluation of generative models
This article reviews mostly known but often underappreciated properties relating to the evaluation and interpretation of generative models with a focus on image models and shows that three of the currently most commonly used criteria---average log-likelihood, Parzen window estimates, and visual fidelity of samples---are largely independent of each other when the data is high-dimensional.
Are GANs Created Equal? A Large-Scale Study
A neutral, multi-faceted large-scale empirical study on state-of-the-art models and evaluation measures finds that most models can reach similar scores with enough hyperparameter optimization and random restarts, suggesting that improvements can arise from a higher computational budget and tuning more than fundamental algorithmic changes.
Variational image compression with a scale hyperprior
It is demonstrated that this model leads to state-of-the-art image compression when measuring visual quality using the popular MS-SSIM index, and yields rate-distortion performance surpassing published ANN-based methods when evaluated using a more traditional metric based on squared error (PSNR).
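For context, the "more traditional metric based on squared error (PSNR)" mentioned above is a log-scaled mean squared error. A minimal sketch of the standard definition (the helper name is illustrative, not from this entry):

```python
import numpy as np

def psnr(mse, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE).

    max_val is the maximum possible pixel value (255 for 8-bit images).
    """
    return 10.0 * np.log10(max_val ** 2 / mse)

# Lower MSE (less distortion) gives higher PSNR; an MSE equal to MAX^2
# corresponds to 0 dB.
```

Because PSNR only rescales MSE, ranking codecs by PSNR at a fixed rate is equivalent to ranking them by squared error.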
GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium
This work proposes a two time-scale update rule (TTUR) for training GANs with stochastic gradient descent on arbitrary GAN loss functions and introduces the "Fréchet Inception Distance" (FID), which captures the similarity of generated images to real ones better than the Inception Score.
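The FID mentioned here compares Gaussians fitted to feature embeddings of real and generated images: FID = ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2(S_r S_g)^{1/2}). A minimal NumPy sketch restricted to diagonal covariances, where the matrix square root reduces to an elementwise one (the function name and the diagonal restriction are illustrative, not from the paper):

```python
import numpy as np

def fid_diagonal(mu_r, var_r, mu_g, var_g):
    """Frechet distance between N(mu_r, diag(var_r)) and N(mu_g, diag(var_g)).

    With diagonal covariances the trace term Tr(S_r + S_g - 2*(S_r S_g)^{1/2})
    reduces to sum(var_r + var_g - 2*sqrt(var_r*var_g)).
    """
    diff = mu_r - mu_g
    trace_term = np.sum(var_r + var_g - 2.0 * np.sqrt(var_r * var_g))
    return float(diff @ diff + trace_term)

# Identical statistics give a distance of zero; shifting one mean coordinate
# by 1 (same variances) gives a distance of exactly 1.
print(fid_diagonal(np.zeros(2), np.ones(2), np.array([1.0, 0.0]), np.ones(2)))  # 1.0
```

In practice the statistics are estimated from Inception-network activations and the full (non-diagonal) matrix square root is used.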