Corpus ID: 231861414

On the Existence of Optimal Transport Gradient for Learning Generative Models

@article{Houdard2021OnTE,
  title={On the Existence of Optimal Transport Gradient for Learning Generative Models},
  author={Antoine Houdard and Arthur Leclaire and Nicolas Papadakis and Julien Rabin},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.05542}
}
The use of optimal transport cost for learning generative models has become popular with Wasserstein Generative Adversarial Networks (WGAN). Training of WGAN relies on a theoretical background: the calculation of the gradient of the optimal transport cost with respect to the generative model parameters. We first demonstrate that such a gradient may not be defined, which can result in numerical instabilities during gradient-based optimization. We address this issue by stating a valid… 
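
To make the object of study concrete, the following is a minimal sketch (not the authors' code) of how this gradient is usually obtained in practice for discrete batches with uniform weights: the optimal transport cost reduces to an assignment problem, the optimal plan is held fixed, and the transport cost is backpropagated through the generator. The toy generator, latent dimension, and squared Euclidean cost are assumptions made for the example.

# Minimal sketch (illustrative): OT cost between a generated batch and a target
# batch, with the gradient taken by fixing the optimal plan and differentiating
# the transport cost through the generator.
import torch
from scipy.optimize import linear_sum_assignment

def ot_cost(generated, target):
    # Pairwise squared Euclidean costs.
    cost = torch.cdist(generated, target) ** 2
    # For uniform weights, the optimal plan is an optimal assignment,
    # computed on a detached copy of the cost matrix.
    rows, cols = linear_sum_assignment(cost.detach().cpu().numpy())
    # Differentiate the cost along the fixed plan.
    return cost[torch.as_tensor(rows), torch.as_tensor(cols)].mean()

g = torch.nn.Linear(8, 2)   # toy generator g_theta (assumption)
z = torch.randn(64, 8)      # latent codes
y = torch.randn(64, 2)      # target samples
loss = ot_cost(g(z), y)
loss.backward()             # derivative of the OT cost w.r.t. the generator parameters

Whether this procedure returns a true gradient is precisely what is at stake here: when the optimal plan is not unique, the fixed-plan derivative need not coincide with a gradient of the optimal transport cost.
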
1 Citation

Kantorovich Strikes Back! Wasserstein GANs are not Optimal Transport?

This paper constructs 1-Lipschitz functions and uses them to build ray monotone transport plans, then thoroughly evaluates popular WGAN dual-form solvers on these benchmark pairs, suggesting that these solvers should not be treated as good estimators of W1, although to some extent they can indeed be used in variational problems requiring the minimization of W2.

References

Showing 1-10 of 28 references

On the Convergence and Robustness of Training GANs with Regularized Optimal Transport

This work shows that obtaining gradient information for the smoothed Wasserstein GAN formulation, which is based on regularized Optimal Transport (OT), is computationally effortless, so first-order optimization methods can be applied to minimize this objective.

Wasserstein-2 Generative Networks

This paper proposes a novel end-to-end algorithm for training generative models that uses a non-minimax objective, which simplifies model training, and approximates the Wasserstein-2 distance with Input Convex Neural Networks.
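
As a rough illustration of the architectural ingredient mentioned above, here is a minimal sketch of an input convex neural network (ICNN); the layer sizes, softplus activations, and weight clamping are assumptions made for the example, not the paper's exact parameterization. The output is convex in the input because the weights applied to hidden activations are kept non-negative and the activations are convex and non-decreasing.

# Minimal ICNN sketch (illustrative): phi(x) is convex in x.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    def __init__(self, dim=2, hidden=64, layers=3):
        super().__init__()
        self.Wx = nn.ModuleList([nn.Linear(dim, hidden) for _ in range(layers)])
        self.Wz = nn.ModuleList([nn.Linear(hidden, hidden, bias=False)
                                 for _ in range(layers - 1)])
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):
        z = F.softplus(self.Wx[0](x))
        for wx, wz in zip(self.Wx[1:], self.Wz):
            # Non-negative weights on z preserve convexity in x.
            z = F.softplus(wx(x) + F.linear(z, wz.weight.clamp(min=0)))
        return F.linear(z, self.out.weight.clamp(min=0), self.out.bias)

phi = ICNN()
x = torch.randn(16, 2, requires_grad=True)
# The gradient of a convex potential is the natural candidate transport map
# for the quadratic cost (Brenier's theorem).
transport = torch.autograd.grad(phi(x).sum(), x)[0]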

A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization

This paper provides a simple procedure to fit generative networks to target distributions, with the goal of a small Wasserstein distance (or other optimal transport costs). The approach is based on…

Generative Adversarial Nets

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
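
For reference, the adversarial process summarized here is the two-player minimax game between the generator G and the discriminator D:

\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]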

A Two-Step Computation of the Exact GAN Wasserstein Distance

This approach optimizes the exact Wasserstein distance, obviating the need for the weight clipping previously used in WGANs, and theoretically proves that the proposed formulation is equivalent to the discrete Monge-Kantorovich dual formulation.

Improved Training of Wasserstein GANs

This work proposes an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input, which performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
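
Below is a minimal sketch of the gradient-penalty term described here; the toy critic, penalty coefficient, and data shapes are assumptions made for the example. The critic's input-gradient norm is pushed towards 1 at random interpolates between real and fake samples.

# Minimal gradient-penalty sketch (illustrative), added to the critic loss.
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    eps = torch.rand(real.size(0), 1, device=real.device)      # per-sample mixing
    interp = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads, = torch.autograd.grad(scores.sum(), interp, create_graph=True)
    # Penalize deviations of the gradient norm from 1.
    return lam * ((grads.norm(2, dim=1) - 1.0) ** 2).mean()

critic = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.ReLU(),
                             torch.nn.Linear(64, 1))
real, fake = torch.randn(32, 2), torch.randn(32, 2)
penalty = gradient_penalty(critic, real, fake)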

Stochastic Optimization for Large-scale Optimal Transport

A new class of stochastic optimization algorithms to cope with large-scale problems routinely encountered in machine learning applications, based on entropic regularization of the primal OT problem, which results in a smooth dual optimization problem that can be addressed with algorithms that have provably faster convergence.
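
A minimal sketch of the stochastic semi-dual idea follows; the squared Euclidean cost, regularization strength, and step size are assumptions made for the example. With a discrete target measure, entropic OT admits a semi-dual objective written as an expectation over source samples, so the dual variable can be updated by stochastic gradient ascent one mini-batch at a time.

# Minimal sketch (illustrative): stochastic ascent on the entropic semi-dual,
# where the target is discrete (weights b on points y) and v is the dual variable.
import torch

def semi_dual(x, y, b, v, eps=0.1):
    cost = torch.cdist(x, y) ** 2
    # Soft c-transform: smoothed minimum over j of c(x, y_j) - v_j.
    soft_min = -eps * torch.logsumexp((v - cost) / eps + torch.log(b), dim=1)
    return (v * b).sum() + soft_min.mean() - eps

y = torch.randn(100, 2)                    # support of the discrete target
b = torch.full((100,), 1.0 / 100)          # target weights
v = torch.zeros(100, requires_grad=True)   # dual potential on the target points
opt = torch.optim.SGD([v], lr=0.5)
for _ in range(200):
    x = torch.randn(64, 2)                 # stream of source samples
    loss = -semi_dual(x, y, b, v)          # maximize the semi-dual
    opt.zero_grad(); loss.backward(); opt.step()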

Learning Generative Models with Sinkhorn Divergences

This paper presents the first tractable computational method to train large-scale generative models using an optimal transport loss, and tackles three issues by relying on two key ideas: entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed-point iterations, and algorithmic (automatic) differentiation of these iterations.
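
A minimal sketch combining these two ideas follows; the squared Euclidean cost, regularization strength, and iteration count are assumptions made for the example. A fixed number of log-domain Sinkhorn fixed-point iterations is unrolled, and automatic differentiation backpropagates through them, so the resulting loss can train a generator end to end.

# Minimal Sinkhorn sketch (illustrative): unrolled log-domain fixed-point
# iterations; the returned value is the transport cost under the regularized plan.
import math
import torch

def sinkhorn_cost(x, y, eps=0.1, iters=50):
    cost = torch.cdist(x, y) ** 2
    log_a = torch.full((x.size(0),), -math.log(x.size(0)))   # uniform source weights
    log_b = torch.full((y.size(0),), -math.log(y.size(0)))   # uniform target weights
    f, g = torch.zeros(x.size(0)), torch.zeros(y.size(0))
    for _ in range(iters):
        # Sinkhorn fixed-point updates of the dual potentials.
        f = -eps * torch.logsumexp((g - cost) / eps + log_b, dim=1)
        g = -eps * torch.logsumexp((f - cost.T) / eps + log_a, dim=1)
    plan = torch.exp((f[:, None] + g[None, :] - cost) / eps
                     + log_a[:, None] + log_b[None, :])
    return (plan * cost).sum()

gen = torch.nn.Linear(8, 2)                # toy generator (assumption)
loss = sinkhorn_cost(gen(torch.randn(64, 8)), torch.randn(64, 2))
loss.backward()

Sinkhorn divergences additionally subtract self-transport terms to reduce the entropic bias; this sketch returns only the regularized transport cost.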

Large Scale Optimal Transport and Mapping Estimation

This paper proposes a stochastic dual approach to regularized OT, shows empirically that it scales better than a recent related approach when the number of samples is very large, and estimates a Monge map as a deep neural network learned by approximating the barycentric projection of the previously obtained OT plan.

Differentiable Augmentation for Data-Efficient GAN Training

DiffAugment is a simple method that improves the data efficiency of GANs by imposing various types of differentiable augmentations on both real and fake samples, and can generate high-fidelity images using only 100 images without pre-training, while being on par with existing transfer learning algorithms.
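
A minimal sketch of the idea follows; the specific augmentations and magnitudes, as well as the placeholder discriminator D and loss bce in the trailing comments, are assumptions rather than the DiffAugment policy. The same kinds of differentiable random augmentations are applied to both real and fake batches before the discriminator, so gradients still flow back to the generator.

# Minimal sketch (illustrative): differentiable random augmentations applied to
# both real and fake images before the discriminator.
import torch

def augment(x):
    # Random brightness shift (differentiable with respect to x).
    x = x + (torch.rand(x.size(0), 1, 1, 1, device=x.device) - 0.5) * 0.4
    # Random translation via circular shift (also differentiable).
    shift = int(torch.randint(-2, 3, (1,)))
    return torch.roll(x, shifts=(shift, shift), dims=(2, 3))

# Both branches see augmented inputs (D and bce are placeholders):
# d_loss = bce(D(augment(real)), 1) + bce(D(augment(fake.detach())), 0)
# g_loss = bce(D(augment(fake)), 1)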