Corpus ID: 165163834

Training language GANs from Scratch

@inproceedings{dAutume2019TrainingLG,
  title={Training language GANs from Scratch},
  author={Cyprien de Masson d'Autume and Mihaela Rosca and Jack W. Rae and Shakir Mohamed},
  booktitle={Neural Information Processing Systems},
  year={2019}
}
Generative Adversarial Networks (GANs) enjoy great success at image generation, but have proven difficult to train in the domain of natural language. Challenges with gradient estimation, optimization instability, and mode collapse have led practitioners to resort to maximum likelihood pre-training, followed by small amounts of adversarial fine-tuning. The benefits of GAN fine-tuning for language generation are unclear, as the resulting models produce comparable or worse samples than…
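For context, the "maximum likelihood pre-training followed by adversarial fine-tuning" recipe the abstract refers to typically looks like the sketch below: an autoregressive generator is first trained with teacher-forced cross-entropy, then updated with a REINFORCE-style gradient that uses the discriminator's output as a per-sequence reward. This is an illustrative toy implementation of that generic recipe, not the paper's own method; all sizes, data, and module names are placeholders.

    # Hedged sketch of the standard language-GAN pipeline: MLE pre-training,
    # then REINFORCE-based adversarial fine-tuning. Toy sizes and random "data".
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB, EMB, HID, SEQ_LEN, BATCH = 100, 32, 64, 20, 16

    class Generator(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.rnn = nn.GRU(EMB, HID, batch_first=True)
            self.out = nn.Linear(HID, VOCAB)

        def forward(self, tokens):                 # teacher forcing, for MLE
            h, _ = self.rnn(self.emb(tokens))
            return self.out(h)                     # logits at every position

        def sample(self, batch, length):           # free-running sampling
            tok = torch.zeros(batch, 1, dtype=torch.long)   # BOS = 0 (placeholder)
            hidden, seq, logps = None, [], []
            for _ in range(length):
                h, hidden = self.rnn(self.emb(tok), hidden)
                dist = torch.distributions.Categorical(logits=self.out(h[:, -1]))
                tok = dist.sample().unsqueeze(1)
                seq.append(tok)
                logps.append(dist.log_prob(tok.squeeze(1)))
            return torch.cat(seq, dim=1), torch.stack(logps, dim=1)

    class Discriminator(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.rnn = nn.GRU(EMB, HID, batch_first=True)
            self.out = nn.Linear(HID, 1)

        def forward(self, tokens):
            _, h = self.rnn(self.emb(tokens))
            return self.out(h[-1]).squeeze(-1)     # one logit per sequence

    gen, disc = Generator(), Discriminator()
    g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
    d_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
    real = torch.randint(0, VOCAB, (BATCH, SEQ_LEN))   # stand-in for real text

    # 1) MLE pre-training: next-token cross-entropy with teacher forcing.
    for _ in range(5):
        logits = gen(real[:, :-1])
        loss = F.cross_entropy(logits.reshape(-1, VOCAB), real[:, 1:].reshape(-1))
        g_opt.zero_grad(); loss.backward(); g_opt.step()

    # 2) Adversarial fine-tuning: discriminator score used as a REINFORCE reward.
    for _ in range(5):
        fake, logps = gen.sample(BATCH, SEQ_LEN)
        d_loss = (F.binary_cross_entropy_with_logits(disc(real), torch.ones(BATCH))
                  + F.binary_cross_entropy_with_logits(disc(fake), torch.zeros(BATCH)))
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()
        reward = torch.sigmoid(disc(fake)).detach()        # per-sequence reward
        g_loss = -(reward.unsqueeze(1) * logps).mean()     # REINFORCE estimator
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()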

Citations

Generative Cooperative Networks for Natural Language Generation

Generative Cooperative Networks are introduced, in which the discriminator is used cooperatively with the generation policy to output realistic text samples for the task at hand, and theoretical guarantees of convergence are given.

InitialGAN: A Language GAN with Completely Random Initialization

This work proposes InitialGAN, the first language GAN to outperform MLE without using any pre-training techniques, and introduces a new evaluation metric, Least Coverage Rate, to better evaluate the quality of generated samples.

OptAGAN: Entropy-based finetuning on text VAE-GAN

This work combines the training of GANs in the latent space with the finetuning of the decoder of Optimus for single-word generation, and finetunes using reinforcement learning (RL) by exploiting the structure of GPT-2 and by adding entropy-based intrinsically motivated rewards to balance quality and diversity.

End-to-End Differentiable GANs for Text Generation

While this approach, without any pretraining, is more stable during training and outperforms other GAN-based approaches, it still falls behind MLE; this gap is found to be due to the autoregressive nature of text generation and its architectural requirements, as well as a fundamental difference between the definition of Wasserstein distance in the image and text domains.

Language GANs Falling Short

The impact of exposure bias on sample quality is less severe than previously thought, and temperature tuning provides a better quality / diversity trade-off than adversarial training while being easier to train, easier to cross-validate, and less computationally expensive.
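Concretely, the temperature tuning referred to divides the model's logits by a scalar T before the softmax, so T < 1 sharpens the distribution (favouring quality) and T > 1 flattens it (favouring diversity). A minimal sketch with made-up logits:

    # Minimal sketch of temperature-tuned sampling; the logits are illustrative.
    import torch
    import torch.nn.functional as F

    logits = torch.tensor([2.0, 1.0, 0.5, -1.0])         # per-token scores from a model
    for temperature in (0.5, 1.0, 1.5):
        probs = F.softmax(logits / temperature, dim=-1)   # T < 1 sharpens, T > 1 flattens
        print(temperature, probs.tolist())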

Language GANs Falling Short

This work finds that 1) exposure bias appears to be less of an issue than the complications arising from non-differentiable, sequential GAN training; 2) MLE-trained models provide a better quality/diversity tradeoff compared to their GAN counterparts, all while being easier to train, easier to cross-validate, and less computationally expensive.

Making Use of Latent Space in Language GANs for Generating Diverse Text without Pre-training

A GAN model that aims to improve the generation of diverse texts conditioned on the latent space, and is quite competitive with existing baseline models, which require pre-training.

ColdGANs: Taming Language GANs with Cautious Sampling Strategies

For the first time, to the best of the authors' knowledge, the proposed language GANs compare favorably to MLE and obtain improvements over the state of the art on three generative tasks, namely unconditional text generation, question generation, and abstractive summarization.
...

References

Showing 1-10 of 57 references

Language Generation with Recurrent Generative Adversarial Networks without Pre-training

It is shown that recurrent neural networks can be trained to generate text with GANs from scratch by slowly teaching the model to generate sequences of increasing and variable length, which vastly improves the quality of generated sequences compared to a convolutional baseline.
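The length curriculum described amounts to capping the sampled sequence length and growing the cap over training; a minimal sketch follows, where the stage schedule and cap increments are illustrative placeholders rather than the reference's actual settings.

    import random

    # Illustrative length curriculum: train on progressively longer, variable-length
    # sequences; the exact schedule here is a placeholder, not the reference's.
    def curriculum_length(stage, max_len=40):
        cap = min(max_len, 5 * (stage + 1))   # grow the allowed length each stage
        return random.randint(1, cap)         # variable length within the cap

    for stage in range(4):
        print(stage, [curriculum_length(stage) for _ in range(5)])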

Adversarial Ranking for Language Generation

This paper proposes a novel generative adversarial network, RankGAN, for generating high-quality language descriptions by viewing a set of data samples collectively and evaluating their quality through relative ranking scores, which enables better assessment and in turn helps to learn a better generator.

Improved Training of Wasserstein GANs

This work proposes an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input, which performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
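The proposed penalty is the standard WGAN-GP term: the norm of the critic's gradient on random interpolates of real and generated samples is pushed towards 1. A PyTorch sketch for a generic critic module (here critic, real, and fake are placeholders) might look like:

    import torch

    def gradient_penalty(critic, real, fake, lambda_gp=10.0):
        # Standard WGAN-GP term: penalize deviation of the critic's input-gradient
        # norm from 1, evaluated on random interpolates of real and fake samples.
        eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
        interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
        scores = critic(interp)
        grads = torch.autograd.grad(outputs=scores.sum(), inputs=interp,
                                    create_graph=True)[0]
        norms = grads.flatten(start_dim=1).norm(2, dim=1)
        return lambda_gp * ((norms - 1.0) ** 2).mean()

The returned term is added to the critic's loss; lambda_gp = 10 is the coefficient commonly used with this penalty.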

Evaluating Text GANs as Language Models

This work proposes to approximate the distribution of text generated by a GAN, which permits evaluating them with traditional probability-based LM metrics, and shows that they currently perform substantially worse than state-of-the-art LMs.

On Accurate Evaluation of GANs for Language Generation

It is demonstrated that the previously used BLEU score is not sensitive to semantic deterioration of generated texts, and alternative metrics that better capture the quality and diversity of the generated samples are proposed.

Adversarial Feature Matching for Text Generation

This work proposes a framework for generating realistic text via adversarial training, using a long short-term memory network as generator, and a convolutional network as discriminator, and proposes matching the high-dimensional latent feature distributions of real and synthetic sentences, via a kernelized discrepancy metric.

Maximum-Likelihood Augmented Discrete Generative Adversarial Networks

This work derives a novel and low-variance GAN objective using the discriminator's output, which corresponds to the log-likelihood; the objective is proved to be consistent in theory and beneficial in practice.
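As a rough illustration only: this kind of low-variance objective is commonly described as reweighting generator samples with normalized importance weights built from the discriminator, of the form D(x)/(1 - D(x)). The snippet below sketches that weight computation under this assumption and is not taken verbatim from the reference.

    import torch

    def importance_weights(d_probs, eps=1e-8):
        # Assumed form: r(x) = D(x) / (1 - D(x)), normalized over the batch, so the
        # weighted log-likelihood of generator samples approximates the data distribution.
        r = d_probs / (1.0 - d_probs + eps)
        return r / r.sum()

    d_probs = torch.tensor([0.9, 0.6, 0.2])   # illustrative discriminator outputs
    print(importance_weights(d_probs))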

Language GANs Falling Short

The impact of exposure bias on sample quality is less severe than previously thought, and temperature tuning provides a better quality / diversity trade-off than adversarial training while being easier to train, easier to cross-validate, and less computationally expensive.

Large Scale GAN Training for High Fidelity Natural Image Synthesis

It is found that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the generator's input.
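The "truncation trick" mentioned amounts to drawing the generator's latent vector from a truncated normal, e.g. by resampling entries whose magnitude exceeds a threshold; a minimal sketch follows (the threshold value is illustrative).

    import torch

    def truncated_normal(shape, threshold=0.5):
        # Resample any latent entries whose magnitude exceeds the threshold;
        # smaller thresholds trade sample variety for fidelity.
        z = torch.randn(shape)
        mask = z.abs() > threshold
        while mask.any():
            z[mask] = torch.randn(int(mask.sum()))
            mask = z.abs() > threshold
        return z

    z = truncated_normal((4, 8), threshold=0.7)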

Synthesizing Programs for Images using Reinforced Adversarial Learning

SPIRAL is an adversarially trained agent that generates a program which is executed by a graphics engine to interpret and sample images, and a surprising finding is that using the discriminator's output as a reward signal is the key to allow the agent to make meaningful progress at matching the desired output rendering.
...