Corpus ID: 219636023

Improving GAN Training with Probability Ratio Clipping and Sample Reweighting

  • Yue Wu, Pan Zhou, Andrew Gordon Wilson, Eric P. Xing, Zhiting Hu
  • Published 12 June 2020
  • Mathematics, Computer Science
  • ArXiv
Despite success on a wide range of problems related to vision, generative adversarial networks (GANs) can suffer from inferior performance due to unstable training, especially for text generation. We propose a new variational GAN training framework which enjoys superior training stability. Our approach is inspired by a connection of GANs and reinforcement learning under a variational perspective. The connection leads to (1) probability ratio clipping that regularizes generator training to…
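
The abstract is cut off before it states the exact objective, so the following is only an illustrative sketch of probability ratio clipping in the style of the PPO clipped surrogate (arXiv:1707.06347, listed among the references below). It is not the authors' formulation. The names `clipped_surrogate`, `ratio`, `advantage`, and `eps` are assumptions chosen for this example.

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO-style clipped surrogate for a single sample (illustrative only).

    ratio     -- new-model probability divided by old-model probability
    advantage -- signal weighting the sample (hypothetical; in a GAN setting
                 this could be derived from the discriminator score)
    eps       -- clipping range; 0.2 is an assumed default, not from the page
    """
    # Keep the ratio inside [1 - eps, 1 + eps].
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    # Take the more pessimistic of the unclipped and clipped terms, so the
    # objective gives no extra reward for moving the ratio past the range.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    for r, a in [(1.5, 1.0), (0.5, -1.0), (1.0, 2.0)]:
        print(r, a, clipped_surrogate(r, a))
```

With a positive advantage, a ratio above 1 + eps earns no additional objective value, which limits how far a single update can push the generator away from its previous version.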


Math Word Problem Generation with Mathematical Consistency and Problem Context Constraints
A novel MWP generation approach is developed that leverages i) pre-trained language models and a context keyword selection model to improve the language quality of the generated MWPs and ii) an equation consistency constraint for math equations to improve the mathematical validity of the generated MWPs.
EXoN: EXplainable encoder Network
It is found that both negative cross-entropy and Kullback-Leibler divergence play a crucial role in constructing an explainable latent space, and that the variability of the generated samples from the proposed model depends on a specific subspace, called the ‘activated latent subspace’.
Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed
A novel connection between knowledge distillation and image generation is established with a technique that distills a multi-step denoising process into a single step, resulting in a sampling speed similar to other single-step generative models.
Panoramic Learning with A Standardized Machine Learning Formalism
A standardized ML formalism is presented, in particular a standard equation of the learning objective, that offers a unifying understanding of diverse ML algorithms, making them special cases due to different choices of modeling components.
Self-Diagnosing GAN: Diagnosing Underrepresented Samples in Generative Adversarial Networks
This work proposes a simple yet effective method to diagnose and emphasize underrepresented samples during training of a GAN, using the statistics of the discrepancy between the data distribution and the model distribution at each data instance.
Solve Minimax Optimization by Anderson Acceleration
A new minimax optimization framework, GDA-AM, is proposed that views the GDA dynamics as a fixed-point iteration and solves it using Anderson Mixing to converge to the local minimax.
Symbolic Music Generation with Transformer-GANs
It is demonstrated via human evaluations and a new discriminative metric that the music generated by the approach outperforms a baseline trained with likelihood maximization, the state-of-the-art Music Transformer, and other GANs used for sequence generation.
Unaligned Image-to-Image Translation by Learning to Reweight
  • Shaoan Xie, Mingming Gong, Yanwu Xu, Kun Zhang
  • Computer Science
  • ArXiv
  • 2021
This paper proposes selecting images by importance reweighting and develops a method that learns the weights and performs translation simultaneously and automatically. It compares the proposed method with state-of-the-art image translation approaches and presents qualitative and quantitative results on different tasks with unaligned domains.
A Systematic Survey of Regularization and Normalization in GANs
  • Ziqiang Li, Xintian Wu, +4 authors Bin Li
  • Computer Science, Engineering
  • 2020
A comprehensive survey of regularization and normalization techniques is conducted from different perspectives of GAN training, and a new taxonomy is proposed based on their objectives. The survey also compares the performance of mainstream methods on different datasets and investigates the regularization and normalization techniques frequently employed in SOTA GANs.
Learning from All Types of Experiences: A Unifying Machine Learning Perspective
This tutorial presents a systematic, unified blueprint of ML that offers both a holistic understanding of the diverse ML paradigms and algorithms and guidance on operationalizing ML to create problem solutions in a composable manner.


Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347
  • 2017
Style Transfer for Texts: Retrain, Report Errors, Compare with Rewrites
This paper shows that the standard assessment methodology for style transfer has several significant problems, suggests taking BLEU between input and human-written reformulations into consideration for benchmarks, and proposes three new architectures that outperform the state of the art in terms of this metric.
Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect
This paper proposes a novel approach to enforcing the Lipschitz continuity in the training procedure of WGANs, which gives rise to not only better photo-realistic samples than the previous methods but also state-of-the-art semi-supervised learning results.
Improved Training of Wasserstein GANs
This work proposes an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input. The approach performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
Trust Region Policy Optimization
A method for optimizing control policies with guaranteed monotonic improvement is described; making several approximations to the theoretically justified scheme yields a practical algorithm called Trust Region Policy Optimization (TRPO).
Long Text Generation via Adversarial Training with Leaked Information
The discriminative net is allowed to leak its own high-level extracted features to the generative net to further help the guidance, and without any supervision, LeakGAN is able to implicitly learn sentence structures only through the interaction between Manager and Worker.
Maximum a Posteriori Policy Optimisation
This work introduces a new algorithm for reinforcement learning called Maximum a Posteriori Policy Optimisation (MPO) based on coordinate ascent on a relative entropy objective and develops two off-policy algorithms that are competitive with the state-of-the-art in deep reinforcement learning.
GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium
This work proposes a two time-scale update rule (TTUR) for training GANs with stochastic gradient descent on arbitrary GAN loss functions and introduces the "Fréchet Inception Distance" (FID), which captures the similarity of generated images to real ones better than the Inception Score.
Convolutional Deep Belief Networks on CIFAR-10
We describe how to train a two-layer convolutional Deep Belief Network (DBN) on the 1.6 million tiny images dataset. When training a convolutional DBN, one must decide what to do with the edge pixels…