Corpus ID: 209318141

Evaluating Lossy Compression Rates of Deep Generative Models

@inproceedings{Huang2020EvaluatingLC,
  title={Evaluating Lossy Compression Rates of Deep Generative Models},
  author={Sicong Huang and Alireza Makhzani and Yanshuai Cao and Roger Baker Grosse},
  booktitle={International Conference on Machine Learning},
  year={2020}
}
The field of deep generative modeling has succeeded in producing astonishingly realistic-seeming images and audio, but quantitative evaluation remains a challenge. Log-likelihood is an appealing metric due to its grounding in statistics and information theory, but it can be challenging to estimate for implicit generative models, and scalar-valued metrics give an incomplete picture of a model's quality. In this work, we propose to use rate distortion (RD) curves to evaluate and compare deep… 
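
As background for the truncated abstract above, here is a minimal LaTeX sketch of the rate-distortion quantities involved; the notation (encoder q(z|x), decoder \hat{x}(z), prior p(z)) is a standard assumption, not quoted from the paper.

R(D) \;=\; \min_{q(z \mid x)\,:\ \mathbb{E}[d(X,\hat{X})] \le D} \; I(X; Z)

\mathcal{R} \;=\; \mathbb{E}_{x}\!\left[\mathrm{KL}\!\left(q(z \mid x)\,\|\,p(z)\right)\right] \;\ge\; I(X; Z),
\qquad
\mathcal{D} \;=\; \mathbb{E}_{x}\,\mathbb{E}_{q(z \mid x)}\!\left[d\!\left(x,\hat{x}(z)\right)\right]

Sweeping the weight β in the objective 𝒟 + β·ℛ yields one achievable (rate, distortion) pair per β, tracing an upper bound on the R(D) curve; this is why an RD curve gives a richer picture of a model than any single scalar metric.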

Neural Estimation of the Rate-Distortion Function for Massive Datasets

This work reformulates the rate-distortion objective, solves the resulting functional optimization problem using neural networks, and describes a method to implement an operational lossy compression scheme with guarantees on the achievable rate and distortion.

Neural Estimation of the Rate-Distortion Function With Applications to Operational Source Coding

This paper describes how NERD can be used to construct an operational one-shot lossy compression scheme with guarantees on the achievable rate and distortion, and investigates methods to estimate the rate-distortion function on large, real-world data.
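
To give a concrete sense of the objective behind these two NERD-style papers, below is a minimal NumPy sketch that evaluates one point of an upper bound on the rate-distortion function for a fixed trade-off parameter beta and a fixed set of reproduction samples, using the standard Blahut-Arimoto parametric form. In the papers the reproduction distribution is parameterized by a neural network and optimized; the function name, squared-error distortion, and uniform weighting below are illustrative assumptions, not the authors' code.

import numpy as np

def rd_point(x, y, beta):
    """Estimate one (rate, distortion) pair for data samples x (n, d) and
    reproduction samples y (m, d) at trade-off parameter beta, using the
    Blahut-Arimoto parametric form with a uniform distribution over y and
    squared-error distortion (both assumptions for illustration)."""
    # pairwise distortions d(x_i, y_j), shape (n, m)
    dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    logits = -beta * dist
    lmax = logits.max(axis=1, keepdims=True)          # for numerical stability
    w = np.exp(logits - lmax)
    # induced conditional q(y_j | x_i) ∝ exp(-beta * d(x_i, y_j))
    q = w / w.sum(axis=1, keepdims=True)
    # distortion under the induced joint distribution
    D = float(np.mean(np.sum(q * dist, axis=1)))
    # rate upper bound in nats: E_x[-log E_Y[exp(-beta d(x, Y))]] - beta * D
    log_mgf = lmax[:, 0] + np.log(w.mean(axis=1))
    R = float(np.mean(-log_mgf) - beta * D)
    return R, D

# Toy usage with a Gaussian source; in NERD the samples y would come from a learned generator.
rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 4))
y = rng.normal(size=(500, 4))
print(rd_point(x, y, beta=1.0))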

Quantitative Understanding of VAE by Interpreting ELBO as Rate Distortion Cost of Transform Coding

It is shown theoretically and experimentally that a VAE can be mapped to an implicit isometric embedding with a scale factor derived from the posterior parameters, which allows the data probabilities in the input space to be estimated from the prior, the loss metric, and the corresponding posterior parameters.
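
The rate-distortion reading of the ELBO referenced here is the standard decomposition, stated below in LaTeX as general background rather than as this paper's specific derivation:

-\mathrm{ELBO}(x) \;=\; \underbrace{\mathbb{E}_{q(z \mid x)}\!\left[-\log p(x \mid z)\right]}_{\text{distortion}} \;+\; \underbrace{\mathrm{KL}\!\left(q(z \mid x)\,\|\,p(z)\right)}_{\text{rate}}

so minimizing the negative ELBO is equivalent to minimizing a rate-distortion cost with β = 1.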

Rate-Regularization and Generalization in VAEs

It is shown that generalization performance continues to improve even after the mutual information saturates, indicating that the gap in the bound affects generalization and suggesting that the standard spherical Gaussian prior is not an inductive bias that typically improves generalization.

Universal Rate-Distortion-Perception Representations for Lossy Compression

It is proved that the corresponding information-theoretic universal rate-distortion-perception function is operationally achievable in an approximate sense, which motivates the study of practical constructions that are approximately universal across the RDP tradeoff, thereby alleviating the need to design a new encoder for each objective.
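
For reference, the rate-distortion-perception function discussed here is usually defined as follows (standard notation from the RDP literature, assumed rather than quoted from the paper):

R(D, P) \;=\; \min_{p_{\hat{X} \mid X}} \; I(X; \hat{X})
\quad \text{s.t.} \quad
\mathbb{E}\!\left[\Delta(X, \hat{X})\right] \le D,
\qquad
d\!\left(p_X, p_{\hat{X}}\right) \le P

where Δ is a per-sample distortion measure and d is a divergence between the source and reconstruction distributions (the perception constraint).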

α-VAEs : Optimising variational inference by learning data-dependent divergence skew

The skew-geometric Jensen-Shannon divergence (JSα) allows for an intuitive interpolation between forward and reverse Kullback-Leibler (KL) divergence based on the skew parameter α. While the…

Towards Empirical Sandwich Bounds on the Rate-Distortion Function

This work makes the first attempt at an algorithm for sandwiching the R-D function of a general (not necessarily discrete) source requiring only i.i.d. data samples, and indicates theoretical room for improving state-of-the-art image compression methods by at least one dB in PSNR at various bitrates.

Optimization of Annealed Importance Sampling Hyperparameters

This work presents a parametric AIS process with parameter sharing between annealing distributions, the use of a linear schedule for discretization, and amortization of hyperparameter selection in latent variable models, and assesses the performance of Optimized-Path AIS for marginal likelihood estimation of deep generative models.
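
For readers who have not seen AIS itself, the sketch below shows vanilla annealed importance sampling with a geometric path and a single random-walk Metropolis step per temperature. It is a self-contained toy; the linear schedule, step size, and function names are assumptions for illustration, not the parametric, optimized AIS this paper proposes.

import numpy as np

def ais_log_z(log_f0, sample_f0, log_fT, n_chains=200, n_steps=500, step=0.5, seed=0):
    """Vanilla AIS estimate of log(Z_T / Z_0) between an unnormalized base log_f0
    (with exact sampler sample_f0) and an unnormalized target log_fT, using the
    geometric path f_b ∝ f0^(1-b) * fT^b and one Metropolis step per temperature."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_steps + 1)        # linear schedule (an assumption)
    x = sample_f0(n_chains, rng)                      # (n_chains, dim)
    log_w = np.zeros(n_chains)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # importance-weight increment from moving the bridge f_{b_prev} -> f_b
        log_w += (b - b_prev) * (log_fT(x) - log_f0(x))
        # one random-walk Metropolis step leaving f_b invariant
        log_p = lambda z: (1.0 - b) * log_f0(z) + b * log_fT(z)
        prop = x + step * rng.normal(size=x.shape)
        accept = np.log(rng.uniform(size=n_chains)) < (log_p(prop) - log_p(x))
        x = np.where(accept[:, None], prop, x)
    # log-mean-exp of the importance weights
    return np.logaddexp.reduce(log_w) - np.log(n_chains)

# Toy check: base N(0,1), unnormalized target N(0, 2^2); true log Z ratio is log(2) ≈ 0.693.
log_f0 = lambda z: -0.5 * np.sum(z ** 2, axis=-1)
sample_f0 = lambda n, rng: rng.normal(size=(n, 1))
log_fT = lambda z: -0.5 * np.sum(z ** 2, axis=-1) / 4.0
print(ais_log_z(log_f0, sample_f0, log_fT))  # stochastic lower bound, roughly 0.69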

Denoising Diffusion Probabilistic Models

High quality image synthesis results are presented using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics, which naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.
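
The diffusion model referred to here follows the standard DDPM formulation; for orientation (standard notation, not quoted from the paper):

q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\mathbf{I}\right),
\quad \bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s)

p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 \mathbf{I}\right)

The progressive lossy decompression view transmits x_T first and then successively refines the reconstruction toward x_0.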

References

Showing 1-10 of 48 references

On the Quantitative Analysis of Decoder-Based Generative Models

This work proposes to use Annealed Importance Sampling for evaluating log-likelihoods of decoder-based models and validates its accuracy using bidirectional Monte Carlo, then analyzes the performance of decoder-based models, the effectiveness of existing log-likelihood estimators, the degree of overfitting, and the degree to which these models miss important modes of the data distribution.

Assessing Generative Models via Precision and Recall

A novel definition of precision and recall for distributions which disentangles the divergence into two separate dimensions is proposed which is intuitive, retains desirable properties, and naturally leads to an efficient algorithm that can be used to evaluate generative models.
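
Below is a minimal sketch of the kind of precision-recall (PRD) computation this paper proposes, assuming features (e.g. embeddings from a pretrained network) are already extracted. The shared k-means discretization and the λ-grid follow the general recipe; the cluster count, grid size, and function names here are illustrative assumptions rather than the paper's exact settings.

import numpy as np
from sklearn.cluster import KMeans

def prd_curve(real_feats, gen_feats, n_clusters=20, n_angles=101):
    """Sketch of a PRD curve: discretize both feature sets with a shared
    k-means codebook, then trade precision against recall over a grid of
    slopes lambda = tan(angle)."""
    X = np.concatenate([real_feats, gen_feats], axis=0)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    p = np.bincount(labels[: len(real_feats)], minlength=n_clusters).astype(float)
    q = np.bincount(labels[len(real_feats):], minlength=n_clusters).astype(float)
    p /= p.sum()
    q /= q.sum()
    angles = np.linspace(1e-6, np.pi / 2 - 1e-6, n_angles)
    lam = np.tan(angles)
    # alpha(lambda) = sum_i min(lambda * p_i, q_i),  beta(lambda) = alpha(lambda) / lambda
    precision = np.array([np.minimum(l * p, q).sum() for l in lam])
    recall = precision / lam
    return precision, recall

# Usage: precision, recall = prd_curve(real_embeddings, generated_embeddings)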

Improving Inference for Neural Image Compression

This work identifies three approximation gaps which limit performance in the conventional approach to compression and proposes improvements to each based on ideas related to iterative inference, stochastic annealing for discrete optimization, and bits-back coding, resulting in the first application of bits-back coding to lossy compression.

Practical Lossless Compression with Latent Variables using Bits Back Coding

Bits Back with ANS (BB-ANS) is presented, a scheme to perform lossless compression with latent variable models at a near optimal rate and it is concluded that with a sufficiently high quality generative model this scheme could be used to achieve substantial improvements in compression rate with acceptable running time.
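
The bits-back argument behind BB-ANS can be summarized in one line (a standard identity, not a quotation from the paper): the net code length per datapoint is

\mathbb{E}_{q(z \mid x)}\!\left[-\log p(x \mid z) \;-\; \log p(z) \;+\; \log q(z \mid x)\right] \;=\; -\mathrm{ELBO}(x) \;\ge\; -\log p(x)

so a sufficiently good latent variable model pushes the achievable lossless rate toward the model's negative log-likelihood.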

An empirical study on evaluation metrics of generative adversarial networks

This paper comprehensively investigates existing sample-based evaluation metrics for GANs and observes that kernel Maximum Mean Discrepancy and the 1-Nearest-Neighbor (1-NN) two-sample test seem to satisfy most of the desirable properties, provided that the distances between samples are computed in a suitable feature space.
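
A minimal NumPy sketch of the squared kernel MMD estimator mentioned above (unbiased form, Gaussian kernel); the kernel choice, bandwidth, and function names are assumptions for illustration, and in practice the inputs would be features computed in a suitable embedding space.

import numpy as np

def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased estimate of squared MMD between samples x (n, d) and y (m, d)
    with a Gaussian (RBF) kernel of bandwidth sigma."""
    def gram(a, b):
        sq = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2.0 * sigma ** 2))
    kxx, kyy, kxy = gram(x, x), gram(y, y), gram(x, y)
    n, m = len(x), len(y)
    # drop diagonal terms for the unbiased U-statistic
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * kxy.mean()

# Toy usage: two sample sets from the same Gaussian should give MMD^2 near 0.
rng = np.random.default_rng(0)
print(mmd2_unbiased(rng.normal(size=(500, 8)), rng.normal(size=(500, 8))))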

Large Scale GAN Training for High Fidelity Natural Image Synthesis

It is found that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the generator's input.
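
The truncation trick itself is easy to sketch: sample the generator's latent from a truncated standard normal rather than a full one, with the threshold controlling the fidelity/variety trade-off. The helper below is an illustrative assumption, not BigGAN's code (which resamples out-of-range values, an equivalent procedure in distribution).

import numpy as np
from scipy.stats import truncnorm

def truncated_z(batch_size, dim, threshold=0.5, seed=None):
    """Sample latents from a standard normal truncated to [-threshold, threshold];
    smaller thresholds reduce input variance, trading sample variety for fidelity."""
    return truncnorm.rvs(-threshold, threshold, size=(batch_size, dim),
                         random_state=seed)

z = truncated_z(batch_size=8, dim=128, threshold=0.5)  # feed z to the generator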

Improved Techniques for Training GANs

This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic, and presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.

A note on the evaluation of generative models

This article reviews mostly known but often underappreciated properties relating to the evaluation and interpretation of generative models with a focus on image models and shows that three of the currently most commonly used criteria---average log-likelihood, Parzen window estimates, and visual fidelity of samples---are largely independent of each other when the data is high-dimensional.

Are GANs Created Equal? A Large-Scale Study

A neutral, multi-faceted large-scale empirical study on state-of-the art models and evaluation measures finds that most models can reach similar scores with enough hyperparameter optimization and random restarts, suggesting that improvements can arise from a higher computational budget and tuning more than fundamental algorithmic changes.

Variational image compression with a scale hyperprior

It is demonstrated that this model leads to state-of-the-art image compression when measuring visual quality using the popular MS-SSIM index, and yields rate-distortion performance surpassing published ANN-based methods when evaluated using a more traditional metric based on squared error (PSNR).
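
The training objective in this line of learned-compression work is typically the Lagrangian rate-distortion loss below (standard notation for the hyperprior model, assumed here rather than quoted from the paper):

\mathcal{L} \;=\; \underbrace{\mathbb{E}\!\left[-\log_2 p_{\hat{y} \mid \hat{z}}(\hat{y} \mid \hat{z}) \;-\; \log_2 p_{\hat{z}}(\hat{z})\right]}_{\text{rate: bits for latents and hyper-latents}} \;+\; \lambda\,\underbrace{\mathbb{E}\!\left[d(x, \hat{x})\right]}_{\text{distortion}}

where ŷ are the quantized image latents, ẑ the quantized hyper-latents that parameterize the entropy model for ŷ, and λ sets the operating point on the rate-distortion curve.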