Corpus ID: 91175758

Towards GAN Benchmarks Which Require Generalization

@article{Gulrajani2019TowardsGB,
  title={Towards GAN Benchmarks Which Require Generalization},
  author={Ishaan Gulrajani and Colin Raffel and Luke Metz},
  journal={ArXiv},
  year={2019},
  volume={abs/2001.03653}
}
For many evaluation metrics commonly used as benchmarks for unconditional image generation, trivially memorizing the training set attains a better score than models which are considered state-of-the-art; we consider this problematic. We clarify a necessary condition for an evaluation metric not to behave this way: estimating the function must require a large sample from the model. In search of such a metric, we turn to neural network divergences (NNDs), which are defined in terms of a neural…
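The NND idea can be sketched minimally: train a critic to separate data samples from model samples, and report the maximized critic gap as the divergence estimate. Everything below is an illustrative assumption, not the paper's setup — a 1-D weight-clipped linear critic standing in for a deep network, and Gaussian toy data.

```python
import random

def nnd_estimate(data, model_samples, steps=200, lr=0.05, clip=1.0):
    """Toy neural-network-divergence estimate: train a weight-clipped
    linear critic f(x) = w * x to maximize E_data[f] - E_model[f]
    (a 1-D stand-in for the critic networks NNDs are defined with);
    the divergence estimate is the maximized gap."""
    mean = lambda xs: sum(xs) / len(xs)
    gap_grad = mean(data) - mean(model_samples)  # d/dw of the objective
    w = 0.0
    for _ in range(steps):
        # Gradient ascent on the critic gap, clipped to [-clip, clip].
        w = max(-clip, min(clip, w + lr * gap_grad))
    return w * gap_grad  # E_data[f] - E_model[f] at the trained w

random.seed(0)
data = [random.gauss(1.0, 1.0) for _ in range(1000)]
near = [random.gauss(1.1, 1.0) for _ in range(1000)]
far = [random.gauss(4.0, 1.0) for _ in range(1000)]
print(nnd_estimate(data, near) < nnd_estimate(data, far))  # True
```

The key property motivating the paper survives even in this toy: estimating the divergence requires samples from the model, since the critic is trained against them.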
Toward a Generalization Metric for Deep Generative Models
Experimental results show that the proposed NND metric can effectively detect training-set memorization and distinguish Generative Latent Variable Models (GLVMs) of different generalization capacities; an efficient method for estimating the complexity of GLVMs is also developed.
Detecting Overfitting of Deep Generative Networks via Latent Recovery
This work shows that simple losses are highly effective at reconstructing images from deep generators. Analyzing the statistics of reconstruction errors for training versus validation images shows that pure GAN models appear to generalize well, in contrast with those using hybrid adversarial losses, which are among the most widely applied generative methods.
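The latent-recovery test above can be sketched as follows. Everything here is an illustrative assumption — a scalar-latent toy "generator" and a finite-difference gradient — but the logic matches the summary: optimize a latent to reconstruct a target, and compare reconstruction errors for images the generator can versus cannot reproduce.

```python
def latent_recover(x, g, steps=300, lr=0.05):
    """Recover a latent z for target x by gradient descent on the squared
    reconstruction error ||g(z) - x||^2, using a finite-difference
    gradient so any scalar-latent generator g works."""
    def err(z):
        return sum((a - b) ** 2 for a, b in zip(g(z), x))
    z, eps = 0.0, 1e-5
    for _ in range(steps):
        z -= lr * (err(z + eps) - err(z - eps)) / (2 * eps)
    return z, err(z)

# Toy linear "generator": maps a scalar latent to a 2-pixel image.
g = lambda z: (0.6 * z, 0.8 * z)

_, err_in_range = latent_recover((0.6, 0.8), g)    # x = g(1.0): recoverable
_, err_off_range = latent_recover((1.0, -1.0), g)  # not in the generator's range
print(err_in_range < 1e-6, err_off_range > 0.1)  # True True
```

In the paper's setting, a gap between recovery errors on training versus validation images is the overfitting signal; here the gap between in-range and off-range targets plays that role.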
Top-K Training of GANs: Improving Generators by Making Critics Less Critical
A simple modification to the Generative Adversarial Network (GAN) training algorithm is introduced that materially improves results with no increase in computational cost: when updating the generator parameters, only the samples the critic scores highest are used; this `top-k update' procedure is shown to be a generally applicable improvement.
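The sample-selection step of the top-k update can be sketched in a few lines; the helper name and the score values are illustrative, not from the paper.

```python
def top_k_indices(critic_scores, k):
    """Indices of the k generated samples the critic scores highest
    (i.e. rates most realistic); in top-k training, only these samples
    contribute to the generator's gradient."""
    order = sorted(range(len(critic_scores)),
                   key=lambda i: critic_scores[i], reverse=True)
    return sorted(order[:k])

scores = [0.2, 0.9, -0.5, 0.7, 0.1]  # critic outputs for a generated batch
print(top_k_indices(scores, 2))  # [1, 3]
```

In a full training loop, the generator loss would be averaged over only these indices, so gradients from the samples the critic rejects most strongly are discarded.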
On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition
To detect intentional memorization, the "Memorization-Informed Fréchet Inception Distance" (MiFID) is proposed as a new memorization-aware metric, and benchmark procedures are designed to ensure that winning submissions made genuine improvements in perceptual quality.
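A sketch of the memorization-distance ingredient such a metric can build on — for each generated feature vector, the minimum cosine distance to any training feature, averaged over the batch. The 2-D feature vectors and the exact aggregation are illustrative assumptions, not MiFID's precise definition.

```python
import math

def memorization_distance(gen_feats, train_feats):
    """Average, over generated feature vectors, of the minimum cosine
    distance (1 - cosine similarity) to any training feature vector.
    Values near 0 suggest the generator is copying training samples."""
    def cos_dist(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return 1.0 - dot / (na * nb)
    return sum(min(cos_dist(g, t) for t in train_feats)
               for g in gen_feats) / len(gen_feats)

train = [(1.0, 0.0), (0.0, 1.0)]
copied = [(2.0, 0.0)]  # same direction as a training feature
novel = [(1.0, 1.0)]   # between the two training features
print(memorization_distance(copied, train))  # 0.0
```

A memorization-aware score would then penalize (scale up) the base FID when this distance falls suspiciously close to zero.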
On Memorization in Probabilistic Deep Generative Models
This work extends a recently proposed measure of memorization for supervised learning to the unsupervised density-estimation problem, adapts it to be more computationally efficient, and provides a framework for understanding problematic memorization in probabilistic generative models.
Generating Private Data Surrogates for Vision Related Tasks
This work demonstrates how to construct surrogate datasets, using images from GAN generators labelled with a classifier trained on the private dataset, and shows this surrogate data can be used for a variety of downstream tasks while being resistant to membership attacks.
Are conditional GANs explicitly conditional?
This paper makes two contributions to conditional Generative Adversarial Networks (cGANs): an analysis showing that cGANs are not explicitly conditional, and a new method that explicitly models conditionality for both parts of the adversarial architecture via a novel a contrario loss that involves training the discriminator to learn unconditional (adverse) examples.
Regularizing Generative Adversarial Networks under Limited Data
This work proposes a regularization approach for training robust GAN models on limited data and theoretically shows a connection between the regularized loss and an f-divergence called the LeCam divergence, which is more robust under limited training data.
A Method for Evaluating the Capacity of Generative Adversarial Networks to Reproduce High-order Spatial Context
It is found that ensembles of generated images can appear visually accurate and achieve low Fréchet Inception Distance while not exhibiting the known spatial arrangements; although low-order ensemble statistics are largely correct, there are numerous quantifiable errors per image that can plausibly affect subsequent use of the GAN-generated images.
GenCo: Generative Co-training on Data-Limited Image Generation
This work designs GenCo, a Generative Co-training network that mitigates discriminator over-fitting by introducing multiple complementary discriminators that provide diverse supervision from multiple distinctive views during training.

References

Showing 1–10 of 51 references
Quantitatively Evaluating GANs With Divergences Proposed for Training
This paper evaluates the performance of various types of GANs using divergence and distance functions typically used only for training, and compares the proposed metrics to human perceptual scores.
Improved Techniques for Training GANs
This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic. It presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.
Realistic Evaluation of Deep Semi-Supervised Learning Algorithms
This work creates a unified reimplementation and evaluation platform for various widely used SSL techniques and finds that the performance of simple baselines which do not use unlabeled data is often underreported, that SSL methods differ in sensitivity to the amount of labeled and unlabeled data, and that performance can degrade substantially when the unlabeled dataset contains out-of-class examples.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy
This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GANs), in which a model attempts to generate realistic samples and a discriminator attempts to tell these apart from data samples.
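The plain (un-optimized) MMD estimator underlying this line of work can be sketched directly from its definition; the 1-D samples and fixed RBF bandwidth below are illustrative assumptions, whereas the paper optimizes the kernel.

```python
import math

def mmd2_rbf(xs, ys, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between 1-D samples
    xs and ys under the RBF kernel k(a, b) = exp(-gamma * (a - b)^2):
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    k = lambda a, b: math.exp(-gamma * (a - b) ** 2)
    kxx = sum(k(a, b) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(k(a, b) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(k(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy

samples = [0.0, 0.5, 1.0]
print(round(mmd2_rbf(samples, samples), 12))     # 0.0
print(mmd2_rbf(samples, [5.0, 5.5, 6.0]) > 1.0)  # True
```

Identical sample sets score zero; well-separated sets score close to the kernel's maximum gap of 2, which is what makes the quantity usable as a two-sample test statistic for model criticism.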
A note on the evaluation of generative models
This article reviews mostly known but often underappreciated properties relating to the evaluation and interpretation of generative models, with a focus on image models, and shows that three of the most commonly used criteria---average log-likelihood, Parzen window estimates, and visual fidelity of samples---are largely independent of each other when the data is high-dimensional.
Progressive Growing of GANs for Improved Quality, Stability, and Variation
A new training methodology for generative adversarial networks is described: starting from a low resolution, new layers that model increasingly fine details are added as training progresses, allowing for images of unprecedented quality.
Good Task Losses for Generative Modeling
Generative modeling of high-dimensional data like images is a notoriously difficult and ill-defined problem. In particular, how to evaluate a learned generative model is unclear. In this paper, we…
Improved Training of Wasserstein GANs
This work proposes an alternative to clipping weights: penalizing the norm of the gradient of the critic with respect to its input, which performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
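The penalty term can be sketched analytically for a critic whose input gradient is known in closed form. This is a deliberate simplification: the method computes the gradient by autodiff at points interpolated between real and generated samples, while the linear critic below is only an illustrative stand-in.

```python
import math

def gradient_penalty(w, lam=10.0):
    """WGAN-GP term lam * (||grad_x f(x)|| - 1)^2, evaluated for a toy
    linear critic f(x) = w . x, whose input gradient is w everywhere
    (a stand-in for autodiff through a deep critic at interpolated
    points between real and generated samples)."""
    grad_norm = math.sqrt(sum(wi * wi for wi in w))
    return lam * (grad_norm - 1.0) ** 2

print(gradient_penalty([1.0, 0.0]))  # 0.0   (gradient norm exactly 1)
print(gradient_penalty([3.0, 4.0]))  # 160.0 = 10 * (5 - 1)^2
```

The penalty is zero exactly when the critic's gradient norm is 1, which is the condition the method uses to softly enforce the 1-Lipschitz constraint that weight clipping enforced crudely.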
A Note on the Inception Score
New insights are provided into the Inception Score, a recently proposed and widely used evaluation metric for generative models, and it is demonstrated that it fails to provide useful guidance when comparing models.
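The score being analyzed is exp(E_x[KL(p(y|x) || p(y))]), computed from a classifier's per-sample class distributions; the sketch below takes those distributions as given (in practice they come from an Inception network) and uses tiny two-class examples.

```python
import math

def inception_score(cond_probs):
    """Inception Score exp(E_x[KL(p(y|x) || p(y))]) from a list of
    per-sample class distributions p(y|x); p(y) is their average."""
    n, c = len(cond_probs), len(cond_probs[0])
    marginal = [sum(p[j] for p in cond_probs) / n for j in range(c)]
    kl = lambda p: sum(pj * math.log(pj / marginal[j])
                       for j, pj in enumerate(p) if pj > 0)
    return math.exp(sum(kl(p) for p in cond_probs) / n)

# Confident, diverse predictions score the maximum (number of classes)...
print(round(inception_score([[1.0, 0.0], [0.0, 1.0]]), 9))  # 2.0
# ...while identical, uncertain predictions score the minimum, 1.0.
print(inception_score([[0.5, 0.5], [0.5, 0.5]]))  # 1.0
```

The two extremes make the critique concrete: the score rewards confident and diverse classifier outputs, but, as the note argues, it never looks at the data distribution at all.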
Learning from Simulated and Unsupervised Images through Adversarial Training
This work develops a method for S+U learning that uses an adversarial network similar to Generative Adversarial Networks (GANs), but with synthetic images as inputs instead of random vectors, and makes several key modifications to the standard GAN algorithm to preserve annotations, avoid artifacts, and stabilize training.