On the Existence of Optimal Transport Gradient for Learning Generative Models
@article{Houdard2021OnTE,
  title   = {On the Existence of Optimal Transport Gradient for Learning Generative Models},
  author  = {Antoine Houdard and Arthur Leclaire and Nicolas Papadakis and Julien Rabin},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2102.05542}
}
The use of optimal transport cost for learning generative models has become popular with Wasserstein Generative Adversarial Networks (WGAN). Training of WGAN relies on a theoretical background: the computation of the gradient of the optimal transport cost with respect to the generative model parameters. We first demonstrate that such a gradient may not be defined, which can result in numerical instabilities during gradient-based optimization. We address this issue by stating a valid…
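For context, the gradient at stake is usually written through an optimal Kantorovich potential, as in the WGAN literature. A hedged sketch of this standard identity for the W_1 case (the notation g_θ for the generator and f* for an optimal potential is mine, not taken from this page):

```latex
% Hedged sketch of the gradient formula whose existence the paper examines
% (generator g_\theta, latent z, model distribution \mu_\theta, data distribution \nu,
% optimal Kantorovich potential f^\star).
\nabla_\theta \, \mathcal{W}_1(\mu_\theta, \nu)
  \;=\; \mathbb{E}_{z}\!\left[ \nabla_\theta f^\star\!\big(g_\theta(z)\big) \right]
```

This identity only holds under regularity assumptions on f* and g_θ, which is precisely where the existence issue raised in the abstract arises.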
One Citation
Kantorovich Strikes Back! Wasserstein GANs are not Optimal Transport?
- Computer Science · ArXiv
- 2022
This paper constructs 1-Lipschitz functions and uses them to build ray monotone transport plans, then thoroughly evaluates popular WGAN dual-form solvers on the resulting benchmark pairs, suggesting that these solvers should not be treated as good estimators of W_1, although to some extent they can still be used in variational problems requiring the minimization of W_1.
References
Showing 1-10 of 28 references
On the Convergence and Robustness of Training GANs with Regularized Optimal Transport
- Computer Science · NeurIPS
- 2018
This work shows that gradient information for the smoothed Wasserstein GAN formulation, which is based on regularized Optimal Transport (OT), can be obtained at little computational cost, so first-order optimization methods can be applied to minimize this objective.
Wasserstein-2 Generative Networks
- Computer Science · ICLR
- 2021
This paper proposes a novel end-to-end algorithm for training generative models that uses a non-minimax objective, simplifying model training, and approximates the Wasserstein-2 distance with Input Convex Neural Networks (a sketch of such a network follows below).
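Since the summary hinges on Input Convex Neural Networks, a minimal hypothetical PyTorch sketch of such a network may help; the layer sizes, activation, and non-negativity handling below are illustrative choices, not taken from the paper:

```python
# Hedged sketch of an Input Convex Neural Network (ICNN), the building block
# used to parametrize convex potentials; sizes and activations are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    def __init__(self, dim, hidden=64, depth=3):
        super().__init__()
        # Weights acting on the previous hidden state must stay non-negative
        # to preserve convexity in the input x.
        self.Wz = nn.ModuleList([nn.Linear(hidden, hidden, bias=False) for _ in range(depth - 1)])
        # Skip connections from the input are unconstrained.
        self.Wx = nn.ModuleList([nn.Linear(dim, hidden) for _ in range(depth)])
        self.out = nn.Linear(hidden, 1, bias=False)

    def forward(self, x):
        z = F.softplus(self.Wx[0](x))  # convex, non-decreasing activation
        for Wz, Wx in zip(self.Wz, self.Wx[1:]):
            z = F.softplus(F.linear(z, Wz.weight.clamp(min=0)) + Wx(x))
        return F.linear(z, self.out.weight.clamp(min=0))  # scalar convex potential
```

The key constraints are that the weights acting on the previous hidden state are non-negative and the activation is convex and non-decreasing, which together make the output convex in the input.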
A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization
- Computer Science · ICML
- 2019
This paper provides a simple procedure to fit generative networks to target distributions, with the goal of a small Wasserstein distance (or another optimal transport cost). The approach is based on…
Generative Adversarial Nets
- Computer Science · NIPS
- 2014
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a…
A Two-Step Computation of the Exact GAN Wasserstein Distance
- Computer Science · ICML
- 2018
This approach optimizes the exact Wasserstein distance, obviating the need for the weight clipping previously used in WGANs, and the proposed formulation is proven to be equivalent to the discrete Monge–Kantorovich dual formulation.
Improved Training of Wasserstein GANs
- Computer Science · NIPS
- 2017
This work proposes an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input, which performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
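The gradient penalty described here is short enough to sketch; the following is a hedged PyTorch illustration (function and argument names are mine), penalizing deviations of the critic's input-gradient norm from 1 on points interpolated between real and generated samples:

```python
# Hedged sketch of a gradient penalty term: penalize the deviation of the
# critic's input gradient norm from 1 on interpolated samples.
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    # Random convex combination of real and fake samples (per batch element).
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    # Gradient of the critic output with respect to its input.
    grads, = torch.autograd.grad(scores.sum(), interp, create_graph=True)
    grad_norm = grads.flatten(1).norm(2, dim=1)
    return lam * ((grad_norm - 1) ** 2).mean()
```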
Stochastic Optimization for Large-scale Optimal Transport
- Computer Science · NIPS
- 2016
A new class of stochastic optimization algorithms to cope with large-scale problems routinely encountered in machine learning applications, based on entropic regularization of the primal OT problem, which results in a smooth dual optimization problem that can be addressed with algorithms having provably faster convergence.
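For reference, the entropic regularization of the primal problem and the smooth dual it induces can be written as follows; this is a sketch in my own notation (regularization parameter ε), not copied from the reference:

```latex
% Hedged sketch of entropic OT and its smooth dual (notation is illustrative).
OT_\varepsilon(\mu,\nu) \;=\; \min_{\pi \in \Pi(\mu,\nu)}
    \int c \, d\pi \;+\; \varepsilon \, \mathrm{KL}\!\left(\pi \,\|\, \mu \otimes \nu\right),
\qquad
OT_\varepsilon(\mu,\nu) \;=\; \max_{u,v}\;
    \mathbb{E}_{\mu}[u] + \mathbb{E}_{\nu}[v]
    \;-\; \varepsilon\, \mathbb{E}_{\mu\otimes\nu}\!\left[
        e^{\left(u(x)+v(y)-c(x,y)\right)/\varepsilon}\right] + \varepsilon
```

The dual is unconstrained and smooth in (u, v), which is what makes stochastic gradient methods applicable at large scale.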
Learning Generative Models with Sinkhorn Divergences
- Computer Science · AISTATS
- 2018
This paper presents the first tractable computational method to train large-scale generative models using an optimal transport loss, tackling the associated computational issues with two key ideas: entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed-point iterations, and algorithmic (automatic) differentiation of these iterations.
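A hedged sketch of that recipe, with log-domain Sinkhorn iterations whose fixed-point loop is unrolled so automatic differentiation can flow back to the samples; the cost, marginals, ε, and iteration count below are illustrative:

```python
# Hedged sketch: entropic OT cost between two point clouds via differentiable
# log-domain Sinkhorn iterations (autograd flows through the unrolled loop).
import torch

def sinkhorn_loss(x, y, eps=0.1, n_iter=100):
    C = torch.cdist(x, y, p=2) ** 2                      # pairwise squared distances
    a = torch.full((x.size(0),), 1.0 / x.size(0), device=x.device)
    b = torch.full((y.size(0),), 1.0 / y.size(0), device=y.device)
    f = torch.zeros_like(a)
    g = torch.zeros_like(b)
    for _ in range(n_iter):                              # log-domain Sinkhorn updates
        f = -eps * torch.logsumexp((g[None, :] - C) / eps + torch.log(b)[None, :], dim=1)
        g = -eps * torch.logsumexp((f[:, None] - C) / eps + torch.log(a)[:, None], dim=0)
    pi = torch.exp((f[:, None] + g[None, :] - C) / eps) * a[:, None] * b[None, :]
    return (pi * C).sum()                                # transport cost, differentiable in x and y
```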
Large Scale Optimal Transport and Mapping Estimation
- Computer Science, Mathematics · ICLR
- 2018
This paper proposes a stochastic dual approach to regularized OT, shows empirically that it scales better than a recent related approach when the number of samples is very large, and estimates a Monge map as a deep neural network learned by approximating the barycentric projection of the previously obtained OT plan.
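The barycentric projection mentioned here has a compact expression; a sketch in my own notation, with π an OT plan between the source and target measures:

```latex
% Hedged sketch: barycentric projection of an OT plan \pi into a deterministic map.
T(x) \;=\; \mathbb{E}_{(X,Y)\sim\pi}\,[\,Y \mid X = x\,],
\qquad\text{discretely:}\qquad
T(x_i) \;=\; \frac{\sum_j \pi_{ij}\, y_j}{\sum_j \pi_{ij}}
```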
Differentiable Augmentation for Data-Efficient GAN Training
- Computer Science · NeurIPS
- 2020
DiffAugment is a simple method that improves the data efficiency of GANs by imposing various types of differentiable augmentations on both real and fake samples, and can generate high-fidelity images using only 100 images without pre-training, while being on par with existing transfer learning algorithms.
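A hedged sketch of the mechanism this summary describes: the same kind of differentiable augmentation is applied to both real and generated samples before the discriminator, so generator gradients pass through the augmentation. The augmentation (a brightness jitter) and the WGAN-style losses below are illustrative, not the paper's exact policy:

```python
# Hedged sketch of differentiable augmentation applied to both real and fake
# samples before the discriminator (assumes NCHW image batches).
import torch

def diff_augment(x):
    # Differentiable per-image brightness shift, illustrative only.
    return x + (torch.rand(x.size(0), 1, 1, 1, device=x.device) - 0.5)

def discriminator_loss(D, real, fake):
    # WGAN-style critic loss shown for concreteness.
    return -D(diff_augment(real)).mean() + D(diff_augment(fake)).mean()

def generator_loss(D, fake):
    return -D(diff_augment(fake)).mean()
```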