Corpus ID: 326772

GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium

Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Sepp Hochreiter

Generative Adversarial Networks (GANs) excel at creating realistic images with complex models for which maximum likelihood is infeasible. Using the theory of stochastic approximation, it is proved that the proposed two time-scale update rule (TTUR) converges under mild assumptions to a stationary local Nash equilibrium. The convergence carries over to the popular Adam optimizer, which is shown to follow the dynamics of a heavy ball with friction and thus to prefer flat minima in the objective landscape.
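The two time-scale idea can be illustrated on a toy minimax game. This is a hypothetical sketch, not the paper's GAN setup: the game V(x, y) = 2xy − y², the step sizes, and the `ttur` name are all illustrative. The "discriminator" variable y ascends V with a fast learning rate while the "generator" variable x descends its loss 2xy with a slow one, so y tracks x closely and the pair settles at the local Nash equilibrium (0, 0).

```python
# Two time-scale update rule (TTUR) on a toy game (illustrative only):
# the discriminator y maximizes V(x, y) = 2xy - y^2 with a fast step,
# the generator x minimizes 2xy with a slow step.
def ttur(lr_g=0.01, lr_d=0.25, steps=2000):
    x, y = 1.0, 0.0
    for _ in range(steps):
        y += lr_d * (2 * x - 2 * y)  # fast ascent: dV/dy = 2x - 2y
        x -= lr_g * (2 * y)          # slow descent: d(2xy)/dx = 2y
    return x, y
```

With the fast inner player, y converges toward x between generator steps, which is the intuition behind letting the discriminator run on the faster time scale.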

On the Convergence and Robustness of Training GANs with Regularized Optimal Transport

This work shows that obtaining gradient information of the smoothed Wasserstein GAN formulation, which is based on regularized Optimal Transport (OT), is computationally effortless and hence one can apply first order optimization methods to minimize this objective.

Conjugate Gradient Method for Generative Adversarial Networks

The conjugate gradient method is applied to the local Nash equilibrium problem in GANs, and the proposed method is shown to outperform stochastic gradient descent (SGD) and momentum SGD.
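As background, the plain linear conjugate gradient iteration the method builds on can be sketched on a two-dimensional quadratic. This is textbook CG under illustrative choices (the matrix diag(2, 20), right-hand side, and function name are assumptions), not the paper's GAN-specific variant:

```python
# Linear conjugate gradient for A w = b, i.e. minimizing
# f(w) = 0.5 * w^T A w - b^T w; matvec applies the (SPD) matrix A.
def conjugate_gradient(matvec, b, w, iters):
    r = [bi - ai for bi, ai in zip(b, matvec(w))]  # residual b - A w
    d = list(r)                                     # initial search direction
    for _ in range(iters):
        Ad = matvec(d)
        alpha = sum(ri * ri for ri in r) / sum(di * adi for di, adi in zip(d, Ad))
        w = [wi + alpha * di for wi, di in zip(w, d)]
        r_new = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        beta = sum(ri * ri for ri in r_new) / sum(ri * ri for ri in r)
        d = [rn + beta * di for rn, di in zip(r_new, d)]  # A-conjugate direction
        r = r_new
    return w

# Example: A = diag(2, 20), b = (2, -40); the minimizer is (1, -2),
# reached exactly in two iterations for this 2-D problem.
w_star = conjugate_gradient(lambda v: [2 * v[0], 20 * v[1]],
                            [2.0, -40.0], [0.0, 0.0], iters=2)
```

The appeal over SGD in this setting is that conjugate directions avoid the zig-zagging of steepest descent on ill-conditioned curvature.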


Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields

It is proved that Coulomb GANs possess only one Nash equilibrium, which is optimal in the sense that the model distribution equals the target distribution, and the approach is shown to be effective on LSUN bedrooms, CelebA faces, CIFAR-10, and Google Billion Word text generation.

Cumulant GAN

A novel loss function for training generative adversarial networks (GANs) is proposed, aiming at deeper theoretical understanding as well as improved stability and performance of the underlying optimization problem.

JR-GAN: Jacobian Regularization for Generative Adversarial Networks

The Jacobian Regularized GANs (JR-GANs) are proposed, which ensure by construction that the two problematic factors of the Jacobian are alleviated, and which achieve near state-of-the-art results both qualitatively and quantitatively.

Single-level Adversarial Data Synthesis based on Neural Tangent Kernels

A new generative model called the generative adversarial NTK (GA-NTK) that has a single-level objective and keeps the spirit of adversarial learning while avoiding the training difficulties of GANs is proposed.

Solving Approximate Wasserstein GANs to Stationarity

This work proposes a smooth approximation of the Wasserstein GAN objective, shows that this approximation is close to the original objective, and presents a class of algorithms with guaranteed theoretical convergence to stationarity.

Variational Bayesian GAN

This study presents a variational GAN (VGAN) in which the encoder, generator, and discriminator are jointly estimated via variational Bayesian inference, and demonstrates the superiority of the proposed VGAN over the variational autoencoder, the standard GAN, and the sampling-based Bayesian GAN.

BEGAN v3: Avoiding Mode Collapse in GANs Using Variational Inference

The proposed model avoids mode collapse and converges to a better state than BEGAN-CS, which improved the loss function but did not solve mode collapse.

First Order Generative Adversarial Networks

This work introduces a theoretical framework which allows the derivation of requirements on both the divergence and corresponding method for determining an update direction, and proposes a novel divergence which approximates the Wasserstein distance while regularizing the critic's first order information.

Gradient descent GAN optimization is locally stable

This paper analyzes the "gradient descent" form of GAN optimization, i.e., the natural setting where small gradient steps are taken simultaneously in both generator and discriminator parameters, and proposes an additional regularization term for gradient descent GAN updates that guarantees local stability for both the WGAN and the traditional GAN.

An Online Learning Approach to Generative Adversarial Networks

A novel training method named Chekhov GAN is proposed and it is shown that this method provably converges to an equilibrium for semi-shallow GAN architectures, i.e. architectures where the discriminator is a one layer network and the generator is arbitrary.

Improved Training of Wasserstein GANs

This work proposes an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input, which performs better than the standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
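A minimal sketch of that penalty, assuming a linear critic f(x) = w·x + b so the input gradient is available in closed form (the function name and the λ = 10 default are illustrative; real implementations differentiate through the critic with autodiff):

```python
import random

# Gradient-penalty sketch for a linear critic f(x) = w.x + b:
# the penalty lam * (||grad_x f|| - 1)^2 pushes the critic toward
# unit gradient norm, as in the WGAN with gradient penalty.
def gradient_penalty(w, real, fake, lam=10.0):
    # The penalty is evaluated at a random interpolate between a real
    # and a fake sample; for a linear critic the gradient is the same
    # everywhere, but we form the interpolate to mirror the recipe.
    eps = random.random()
    x_hat = [eps * r + (1 - eps) * f for r, f in zip(real, fake)]
    grad = w  # grad_x f(x_hat) for a linear critic, independent of x_hat
    norm = sum(g * g for g in grad) ** 0.5
    return lam * (norm - 1.0) ** 2
```

For example, a critic with w = (3, 4) has gradient norm 5, so the penalty is 10 · (5 − 1)² = 160, pulling the weights back toward the 1-Lipschitz ball.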

Towards Understanding the Dynamics of Generative Adversarial Networks

A simple model is proposed that exhibits several of the common problematic convergence behaviors while still admitting the first convergence bounds for parametric GAN dynamics; the model and analysis point to a specific challenge in practical GAN training, called discriminator collapse.

Boundary-Seeking Generative Adversarial Networks

This work introduces a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator.
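The reweighting step can be sketched as follows, assuming a near-optimal discriminator D ≈ p/(p+q) so that D/(1−D) estimates the density ratio; the function name and the batch-level normalization are illustrative assumptions:

```python
# Boundary-seeking-style importance weights from discriminator outputs.
# With D approximating p/(p+q), the ratio D/(1-D) estimates p/q; the
# weights are normalized over the batch before weighting the policy
# gradient for the generator.
def importance_weights(d_outputs):
    ratios = [d / (1.0 - d) for d in d_outputs]  # density-ratio estimates
    z = sum(ratios)
    return [r / z for r in ratios]
```

Samples the discriminator scores closer to "real" receive larger weight, which is what lets discrete samples carry a useful training signal to the generator.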

Approximation and Convergence Properties of Generative Adversarial Learning

It is shown that if the objective function is an adversarial divergence with some additional conditions, then using a restricted discriminator family has a moment-matching effect, thus generalizing previous results.

Generative Adversarial Nets

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

MMD GAN: Towards Deeper Understanding of Moment Matching Network

In the evaluation on multiple benchmark datasets, including MNIST, CIFAR-10, CelebA and LSUN, the performance of MMD-GAN significantly outperforms GMMN, and is competitive with other representative GAN works.

AdaGAN: Boosting Generative Models

An iterative procedure called AdaGAN is proposed: inspired by boosting algorithms, at every step a new component is added to a mixture model by running a GAN algorithm on a re-weighted sample.
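The boosting-style reweighting can be sketched as below. This is a hypothetical scheme: the exponential down-weighting of well-covered samples and the β parameter are illustrative, not the paper's exact weight formula.

```python
import math

# AdaGAN-style reweighting sketch: samples the current mixture already
# covers well have their training weight reduced, so the next GAN
# component focuses on under-covered modes, in the spirit of boosting.
def reweight(weights, coverage, beta=1.0):
    # coverage[i] in [0, 1]: how well the mixture explains sample i
    new = [w * math.exp(-beta * c) for w, c in zip(weights, coverage)]
    z = sum(new)
    return [w / z for w in new]  # renormalize to a distribution
```

For example, with uniform weights and one sample fully covered, the uncovered sample's weight grows, steering the next component toward it.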

Improved Techniques for Training GANs

This work focuses on two applications of GANs: semi-supervised learning and the generation of images that humans find visually realistic; it presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.