Corpus ID: 91175758

Towards GAN Benchmarks Which Require Generalization

@article{Gulrajani2019TowardsGB,
  title={Towards GAN Benchmarks Which Require Generalization},
  author={Ishaan Gulrajani and Colin Raffel and Luke Metz},
  journal={ArXiv},
  year={2019},
  volume={abs/2001.03653}
}
For many evaluation metrics commonly used as benchmarks for unconditional image generation, trivially memorizing the training set attains a better score than models which are considered state-of-the-art; we consider this problematic. We clarify a necessary condition for an evaluation metric not to behave this way: estimating the function must require a large sample from the model. In search of such a metric, we turn to neural network divergences (NNDs), which are defined in terms of a neural… 
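As a rough illustration of the idea (a toy sketch, not the paper's actual metric), a neural network divergence takes the form sup over a critic family f of E_real[f(x)] − E_model[f(x)]; with a norm-bounded linear critic this supremum has a closed form, reducing to the distance between sample means:

```python
import numpy as np

def linear_ipm(real, fake):
    """Toy 'neural network divergence' with a norm-bounded linear critic.

    For f(x) = w.x with ||w|| <= 1, sup_w E_real[f] - E_fake[f] is attained
    at w proportional to the difference of sample means, so the divergence
    reduces to the norm of that difference. Real NNDs use trained nonlinear
    critics, which is what makes them require a large model sample.
    """
    diff = real.mean(axis=0) - fake.mean(axis=0)
    return float(np.linalg.norm(diff))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(1000, 4))
fake_good = rng.normal(0.0, 1.0, size=(1000, 4))  # matches the real distribution
fake_bad = rng.normal(2.0, 1.0, size=(1000, 4))   # mean-shifted model

# A model matching the data scores near 0; a mismatched one scores higher.
assert linear_ipm(real, fake_good) < linear_ipm(real, fake_bad)
```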
Toward a Generalization Metric for Deep Generative Models
TLDR
Experimental results show that the NND metric can effectively detect training-set memorization and distinguish Generative Latent Variable Models (GLVMs) of different generalization capacities; an efficient method for estimating the complexity of GLVMs is also developed.
Detecting Overfitting of Deep Generative Networks via Latent Recovery
TLDR
This work shows that simple losses are highly effective at reconstructing images from deep generators; analyzing the statistics of reconstruction errors for training versus validation images shows that pure GAN models appear to generalize well, in contrast with models using hybrid adversarial losses, which are among the most widely applied generative methods.
Top-K Training of GANs: Improving Generators by Making Critics Less Critical
TLDR
A simple modification to the Generative Adversarial Network (GAN) training algorithm is introduced that materially improves results with no increase in computational cost: when updating the generator parameters, gradients are taken only on the k samples the critic scores as most realistic, and this `top-k update' procedure is shown to be a generally applicable improvement.
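The core of the top-k modification can be sketched as a mask over critic scores (a numpy stand-in; real training would compute gradients through an autodiff framework, and the per-sample loss here is a placeholder):

```python
import numpy as np

def topk_mask(critic_scores, k):
    """Mask selecting the k fake samples the critic rates most realistic.

    In top-k GAN training, the generator update only uses gradients from
    these samples; the rest of the batch is zeroed out.
    """
    scores = np.asarray(critic_scores)
    idx = np.argsort(scores)[-k:]      # indices of the k highest scores
    mask = np.zeros_like(scores)
    mask[idx] = 1.0
    return mask

scores = np.array([0.9, 0.1, 0.5, 0.7])
mask = topk_mask(scores, k=2)
# Only the two highest-scoring samples contribute to the generator loss.
per_sample_loss = -scores              # placeholder stand-in for a GAN loss
generator_loss = (mask * per_sample_loss).sum() / 2
```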
On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition
TLDR
To detect intentional memorization, the "Memorization-Informed Fréchet Inception Distance" (MiFID) is proposed as a new memorization-aware metric, and benchmark procedures are designed to ensure that winning submissions made genuine improvements in perceptual quality.
Generating Private Data Surrogates for Vision Related Tasks
TLDR
This work demonstrates how to construct surrogate datasets, using images from GAN generators, labelled with a classifier trained on the private dataset, and shows this surrogate data can be used for a variety of downstream tasks, while being resistant to membership attacks.
Are conditional GANs explicitly conditional?
TLDR
This paper makes two contributions to conditional Generative Adversarial Networks (cGANs): an analysis showing that cGANs are not explicitly conditional, and a new method that explicitly models conditionality in both parts of the adversarial architecture via a novel a contrario loss, which trains the discriminator on unconditional (adverse) examples.
Regularizing Generative Adversarial Networks under Limited Data
TLDR
This work proposes a regularization approach for training robust GAN models on limited data and theoretically shows a connection between the regularized loss and an f-divergence called LeCam-Divergence, which is more robust under limited training data.
A Method for Evaluating the Capacity of Generative Adversarial Networks to Reproduce High-order Spatial Context
TLDR
It is found that ensembles of generated images can appear visually accurate and attain low Fréchet Inception Distance while failing to exhibit the known spatial arrangements; although low-order ensemble statistics are largely correct, there are numerous quantifiable per-image errors that can plausibly affect subsequent use of the GAN-generated images.
GenCo: Generative Co-training on Data-Limited Image Generation
TLDR
This work designs GenCo, a Generative Co-training network that mitigates the discriminator over-fitting issue by introducing multiple complementary discriminators that provide diverse supervision from multiple distinctive views in training.
Understanding Overparameterization in Generative Adversarial Networks
TLDR
This work theoretically shows that in an overparameterized GAN model with a 1-layer neural network generator and a linear discriminator, gradient descent-ascent (GDA) converges to a global saddle point of the underlying non-convex concave min-max problem; this is the first result for global convergence of GDA in such settings.

References

Showing 1-10 of 51 references
Quantitatively Evaluating GANs With Divergences Proposed for Training
TLDR
This paper evaluates the performance of various types of GANs using divergence and distance functions typically used only for training, and compares the proposed metrics to human perceptual scores.
Improved Techniques for Training GANs
TLDR
This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic, and presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.
Realistic Evaluation of Deep Semi-Supervised Learning Algorithms
TLDR
This work creates a unified reimplementation and evaluation platform for various widely-used SSL techniques and finds that the performance of simple baselines which do not use unlabeled data is often underreported, that SSL methods differ in sensitivity to the amounts of labeled and unlabeled data, and that performance can degrade substantially when the unlabeled dataset contains out-of-class examples.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy
TLDR
This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a discriminator attempts to tell these apart from data samples.
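For reference, the plain (unoptimized) MMD estimator underlying that work admits a compact sketch with an RBF kernel; the paper's contribution is optimizing the kernel, which this toy version omits:

```python
import numpy as np

def mmd2_rbf(x, y, sigma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy with an RBF kernel.

    MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)], where k is a Gaussian
    kernel with bandwidth sigma; it vanishes (in the infinite-sample limit,
    for a characteristic kernel) exactly when the two distributions match.
    """
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return float(k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean())

rng = np.random.default_rng(1)
x = rng.normal(0, 1, (200, 2))
y_same = rng.normal(0, 1, (200, 2))  # samples from the data distribution
y_diff = rng.normal(3, 1, (200, 2))  # samples from a shifted "model"
assert mmd2_rbf(x, y_same) < mmd2_rbf(x, y_diff)
```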
A note on the evaluation of generative models
TLDR
This article reviews mostly known but often underappreciated properties relating to the evaluation and interpretation of generative models with a focus on image models and shows that three of the currently most commonly used criteria---average log-likelihood, Parzen window estimates, and visual fidelity of samples---are largely independent of each other when the data is high-dimensional.
Progressive Growing of GANs for Improved Quality, Stability, and Variation
TLDR
A new training methodology for generative adversarial networks is described, starting from a low resolution, and adding new layers that model increasingly fine details as training progresses, allowing for images of unprecedented quality.
Good Task Losses for Generative Modeling
Generative modeling of high dimensional data like images is a notoriously difficult and ill-defined problem. In particular, how to evaluate a learned generative model is unclear. In this paper, we…
Improved Training of Wasserstein GANs
TLDR
This work proposes an alternative to clipping weights: penalize the norm of gradient of the critic with respect to its input, which performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
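The gradient penalty can be illustrated without autodiff by taking a linear critic, whose input gradient is analytic (a toy sketch only; a real WGAN-GP implementation differentiates the critic at random interpolates between real and fake samples):

```python
import numpy as np

def gradient_penalty_linear(w, lam=10.0):
    """WGAN-GP penalty term for a linear critic f(x) = w.x + b (toy sketch).

    The penalty is lam * (||grad_x f(x_hat)|| - 1)^2 at interpolated points
    x_hat; for a linear critic the input gradient is w everywhere, so the
    penalty has a closed form independent of x_hat.
    """
    grad_norm = np.linalg.norm(w)
    return float(lam * (grad_norm - 1.0) ** 2)

# A critic with unit-norm input gradient incurs no penalty; others do.
assert gradient_penalty_linear(np.array([1.0, 0.0])) == 0.0
assert gradient_penalty_linear(np.array([2.0, 0.0])) == 10.0
```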
A Note on the Inception Score
TLDR
New insights are provided into the Inception Score, a recently proposed and widely used evaluation metric for generative models, and it is demonstrated that it fails to provide useful guidance when comparing models.
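The Inception Score itself is simple to state given a classifier's class-probability outputs, which helps make the note's criticisms concrete (a sketch computing IS from precomputed probabilities, not from an actual Inception network):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Inception Score from class-probability rows p(y|x), one per image.

    IS = exp( E_x KL( p(y|x) || p(y) ) ), where p(y) is the marginal over
    the sample. It is high when each image is confidently classified (low
    conditional entropy) and the predicted classes are diverse (high
    marginal entropy) -- but it never looks at the real data distribution,
    which is one root of the failure modes the note analyzes.
    """
    p = np.asarray(probs, dtype=float)
    marginal = p.mean(axis=0)
    kl = (p * (np.log(p + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

# Confident, diverse predictions over 4 classes -> IS equals the class count.
assert abs(inception_score(np.eye(4)) - 4.0) < 1e-6
```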
Learning from Simulated and Unsupervised Images through Adversarial Training
TLDR
This work develops a method for S+U learning that uses an adversarial network similar to Generative Adversarial Networks (GANs), but with synthetic images as inputs instead of random vectors, and makes several key modifications to the standard GAN algorithm to preserve annotations, avoid artifacts, and stabilize training.