FUDGE: Controlled Text Generation With Future Discriminators

@article{Yang2021FUDGECT,
  title={FUDGE: Controlled Text Generation With Future Discriminators},
  author={Kevin Yang and Dan Klein},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.05218}
}
We propose Future Discriminators for Generation (FUDGE), a flexible and modular method for controlled text generation. Given a pre-existing model G for generating text from a distribution of interest, FUDGE enables conditioning on a desired attribute a (for example, formality) while requiring access only to G’s output logits. FUDGE learns an attribute predictor operating on a partial sequence, and uses this predictor’s outputs to adjust G’s original probabilities. We show that FUDGE models… 
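To make the mechanism concrete, here is a minimal sketch of FUDGE-style decoding, assuming a hypothetical `lm` that returns next-token logits for a prefix and a hypothetical `attribute_model` that scores log P(attribute | partial sequence); both names are illustrative, not the paper's code.

```python
# Minimal sketch of FUDGE-style decoding: rescore the base LM's top candidates
# with a future attribute discriminator evaluated on each partial continuation.
import torch

def fudge_step(lm, attribute_model, prefix_ids, top_k=200):
    with torch.no_grad():
        lm_logits = lm(prefix_ids)                      # (vocab,) next-token logits from G
        log_p_lm = torch.log_softmax(lm_logits, dim=-1)

        # Only rescore the top-k candidates for efficiency.
        topk_logp, topk_ids = log_p_lm.topk(top_k)

        # Score log P(a | prefix + x) for each candidate x with the
        # partial-sequence attribute predictor.
        candidates = torch.cat(
            [prefix_ids.repeat(top_k, 1), topk_ids.unsqueeze(1)], dim=1
        )
        log_p_attr = attribute_model(candidates)        # (top_k,) log P(a | prefix, x)

        # Bayes-rule combination: P(x | prefix, a) ∝ P(x | prefix) * P(a | prefix, x)
        combined = topk_logp + log_p_attr
        next_id = topk_ids[torch.multinomial(torch.softmax(combined, -1), 1)]
    return next_id
```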

Director: Generator-Classifiers For Supervised Language Modeling

A new architecture, Director, is proposed that consists of a unified generator-classifier with both a language-modeling head and a classification head for each output token, and that outperforms existing model-guiding approaches in terms of both accuracy and efficiency.
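A rough sketch of such a combined head is below; the exact mixing rule and training losses are the paper's, and this only illustrates the general shape (the `gamma` mixing weight and the per-token sigmoid classifier are assumptions).

```python
# Illustrative generator-classifier head: one LM head plus one per-token
# classifier head, combined at decoding time.
import torch
import torch.nn as nn

class GeneratorClassifierHead(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int, gamma: float = 1.0):
        super().__init__()
        self.lm_head = nn.Linear(hidden_size, vocab_size)    # language-modeling head
        self.cls_head = nn.Linear(hidden_size, vocab_size)   # per-token classifier head
        self.gamma = gamma                                    # classifier mixing weight

    def forward(self, hidden_state):
        # hidden_state: (hidden_size,) decoder state at the current position.
        log_p_lm = torch.log_softmax(self.lm_head(hidden_state), dim=-1)
        # Per-token probability that emitting this token is "desirable"
        # (e.g. safe, non-repetitive), from the classification head.
        log_p_ok = nn.functional.logsigmoid(self.cls_head(hidden_state))
        # Combine both heads; the next token is drawn from this distribution.
        return torch.log_softmax(log_p_lm + self.gamma * log_p_ok, dim=-1)
```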

Constrained Sampling from Language Models via Langevin Dynamics in Embedding Spaces

This work proposes a sampling procedure that combines the log-likelihood of the language model with arbitrary differentiable constraints into a single energy function, and generates samples by initializing the entire output sequence with noise and following a Markov chain defined by Langevin dynamics using the gradients of this energy.
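The Langevin update itself is standard; the sketch below assumes a hypothetical differentiable `energy` that sums the LM negative log-likelihood with any constraint penalties over soft token embeddings.

```python
# Langevin sampling over a sequence of soft token embeddings.
import torch

def langevin_sample(energy, seq_len, embed_dim, steps=500, step_size=0.1):
    # Initialize the entire output sequence with Gaussian noise.
    x = torch.randn(seq_len, embed_dim, requires_grad=True)
    for _ in range(steps):
        e = energy(x)                       # scalar: -log p_LM(x) + constraint terms
        grad, = torch.autograd.grad(e, x)
        noise = torch.randn_like(x)
        with torch.no_grad():
            # Langevin update: gradient step on the energy plus injected noise.
            x -= step_size * grad
            x += (2 * step_size) ** 0.5 * noise
        x.requires_grad_(True)
    return x.detach()   # soft embeddings; project to nearest tokens afterwards
```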

Improving Controllable Text Generation with Position-Aware Weighted Decoding

A novel framework called CAT-PAW, built on existing weighted decoding methods, is proposed; it introduces a lightweight regulator that adjusts bias signals from the controller at different decoding positions to address the control-strength/fluency trade-off.
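As a hedged illustration of position-aware weighted decoding, the sketch below rescales the controller's bias per decoding position; the real regulator is learned, and this per-position parameterization is only a placeholder.

```python
# Position-aware rescaling of a weighted-decoding bias signal.
import torch
import torch.nn as nn

class PositionRegulator(nn.Module):
    def __init__(self, max_len: int):
        super().__init__()
        # One learnable scale per decoding position (hypothetical parameterization).
        self.scale = nn.Parameter(torch.ones(max_len))

    def forward(self, lm_logits, control_bias, position: int):
        # Standard weighted decoding adds `control_bias` to the LM logits;
        # here its strength depends on where we are in the sequence,
        # trading off control strength against fluency.
        return lm_logits + self.scale[position] * control_bias
```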

SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks

A novel approach to controlling the generation of Transformer-based pre-trained language models is proposed: the SideControl framework, which leverages a novel control-attributes loss to incorporate useful control signals and is shown to perform well with very limited training samples.

The CRINGE Loss: Learning what language not to model

This work proposes a novel procedure for training with negative data called the CRINGE loss (ContRastive Iterative Negative GEneration), and shows the effectiveness of this approach across three different experiments on the tasks of safe generation, contradiction avoidance, and open-domain dialogue.
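A per-token sketch of such a contrastive penalty is given below; details such as how the "positive" token is sampled follow the paper only loosely, and the function name is illustrative.

```python
# CRINGE-style contrastive penalty on a token from a negative sequence.
import torch

def cringe_token_loss(logits, negative_token_id, k=5):
    """Contrast the negative token against a 'positive' token sampled from
    the model's own top-k predictions at the same position."""
    topk_logits, topk_ids = logits.topk(k)
    # Drop the negative token from the candidate pool if it appears in the top-k.
    mask = topk_ids != negative_token_id
    pos_pool_logits = topk_logits[mask]
    pos_pool_ids = topk_ids[mask]
    # Sample one positive token from the renormalized top-k distribution.
    idx = torch.multinomial(torch.softmax(pos_pool_logits, dim=-1), 1)
    s_pos = logits[pos_pool_ids[idx]].squeeze()
    s_neg = logits[negative_token_id]
    # Pairwise contrastive loss: push the positive token's score above the negative one.
    return -torch.log_softmax(torch.stack([s_pos, s_neg]), dim=-1)[0]
```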

DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation

A new CTG approach, DisCup, is proposed, which incorporates the attribute knowledge of a discriminator to optimize the control-prompts, steering a frozen CLM to produce attribute-specific texts.

Composable Text Controls in Latent Space with ODEs

A new efficient approach for composable text operations in the compact latent space of text based on ordinary differential equations given arbitrary plug-in operators, which permits diverse control operators acquired using any relevant data from different domains.

Offline RL for Natural Language Generation with Implicit Language Q Learning

This work proposes a novel offline-RL-motivated method, implicit language Q-learning (ILQL), designed for use on language models, which combines the flexible utility optimization framework of traditional RL algorithms with supervised learning's ability to leverage existing data, as well as its simplicity and stability.
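At inference time the idea can be sketched as perturbing the frozen LM's next-token logits with learned value estimates; the Q/V heads are trained offline with implicit Q-learning, and the names below are illustrative.

```python
# ILQL-style inference: boost tokens whose learned advantage Q(s, a) - V(s) is high.
import torch

def ilql_adjusted_logits(lm_logits, q_values, v_value, beta=1.0):
    # lm_logits, q_values: (vocab,); v_value: scalar state value V(s).
    # Tokens with high learned advantage get boosted, steering generation
    # toward high-utility continuations.
    return lm_logits + beta * (q_values - v_value)

def ilql_decode_step(lm_logits, q_values, v_value, beta=1.0):
    probs = torch.softmax(ilql_adjusted_logits(lm_logits, q_values, v_value, beta), dim=-1)
    return torch.multinomial(probs, 1)
```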

Controllable Text Generation with Neurally-Decomposed Oracle

A general and efficient framework for controlling auto-regressive generation models with a NeurAlly-Decomposed Oracle (NADO) is proposed, along with the closed-form optimal solution for incorporating the token-level guidance into the base model for controllable generation.
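The token-level combination can be sketched as reweighting the base distribution by the ratio of the decomposed oracle's satisfaction estimates with and without each candidate token; the argument names below are assumptions.

```python
# NADO-style token-level guidance: p(x_t | x_<t) * R(a | x_<=t) / R(a | x_<t).
import torch

def nado_step(base_log_probs, r_with_candidate, r_prefix, eps=1e-8):
    """base_log_probs: (vocab,) log p(x_t | x_<t) from the base model.
    r_with_candidate: (vocab,) oracle estimate R(a | x_<t, x_t) per candidate.
    r_prefix: scalar oracle estimate R(a | x_<t) for the current prefix."""
    log_ratio = torch.log(r_with_candidate + eps) - torch.log(torch.as_tensor(r_prefix) + eps)
    combined = base_log_probs + log_ratio
    return torch.log_softmax(combined, dim=-1)   # guided next-token distribution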

Diffusion-LM Improves Controllable Text Generation

A new non-autoregressive language model based on continuous diffusion that iteratively denoises a sequence of Gaussian vectors into word vectors, yielding a sequence of intermediate latent variables that enables a simple gradient-based algorithm to perform complex, controllable generation tasks.
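A very reduced sketch of this controllable denoising loop follows; `denoiser` and `classifier_logp` are assumed components, and the single gradient nudge per step is a simplification of the paper's guided update.

```python
# Denoise Gaussian latents into word embeddings, nudged by a constraint classifier.
import torch

def diffusion_lm_generate(denoiser, classifier_logp, seq_len, embed_dim,
                          num_steps=200, guidance_scale=1.0):
    x = torch.randn(seq_len, embed_dim)                # start from pure noise
    for t in reversed(range(num_steps)):
        x = x.detach().requires_grad_(True)
        # Constraint guidance: push the latents toward higher classifier
        # log-probability of the desired attribute / structure.
        logp = classifier_logp(x)                      # scalar log P(control | x)
        grad, = torch.autograd.grad(logp, x)
        with torch.no_grad():
            x_denoised = denoiser(x, t)                # predict less-noisy latents
            x = x_denoised + guidance_scale * grad
    return x.detach()   # final latents; round to nearest word vectors
```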
...

References


Plug and Play Language Models: A Simple Approach to Controlled Text Generation

The Plug and Play Language Model (PPLM) for controllable language generation is proposed, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM.
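A simplified PPLM-style step is sketched below: the attribute classifier's gradient perturbs the LM's hidden representation before the next token is produced. The real method perturbs the transformer's cached key/values and adds a KL term to stay close to the unmodified distribution; `attribute_log_prob` and `lm_head` are assumed callables.

```python
# Gradient-based perturbation of a hidden state toward an attribute classifier.
import torch

def pplm_perturb(hidden, attribute_log_prob, lm_head, step_size=0.02, n_iters=3):
    delta = torch.zeros_like(hidden, requires_grad=True)
    for _ in range(n_iters):
        score = attribute_log_prob(hidden + delta)     # scalar log P(attribute | state)
        grad, = torch.autograd.grad(score, delta)
        with torch.no_grad():
            # Gradient ascent on the attribute likelihood, with normalized step.
            delta += step_size * grad / (grad.norm() + 1e-8)
        delta.requires_grad_(True)
    logits = lm_head(hidden + delta.detach())          # next-token logits from steered state
    return logits
```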

GeDi: Generative Discriminator Guided Sequence Generation

GeDi is proposed as an efficient method for using smaller LMs as generative discriminators to guide generation from large LMs, making them safer and more controllable; GeDi is found to give stronger controllability than the state-of-the-art method while also achieving generation speeds more than 30 times faster.
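The guiding step can be sketched as a Bayes-rule reweighting: the small class-conditional LM scores each candidate under the desired and undesired control codes, and the resulting posterior reweights the large LM's distribution. The uniform prior and the `omega` exponent below are assumptions for illustration.

```python
# GeDi-style reweighting of a large LM by a class-conditional discriminator.
import torch

def gedi_step(large_lm_logits, cc_logp_desired, cc_logp_undesired, omega=1.0):
    """All inputs are (vocab,) scores over the next token; the cc_* arguments are
    cumulative log-likelihoods of the sequence-so-far plus each candidate token
    under the desired / undesired control codes of the small LM."""
    # Posterior P(desired | x_{1:t+1}) per candidate, assuming a uniform class prior.
    posterior = torch.softmax(
        torch.stack([cc_logp_desired, cc_logp_undesired]), dim=0
    )[0]
    # Weight the base distribution by the posterior raised to the power omega.
    guided = torch.log_softmax(large_lm_logits, dim=-1) + omega * torch.log(posterior + 1e-10)
    return torch.log_softmax(guided, dim=-1)
```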

Fluent Translations from Disfluent Speech in End-to-End Speech Translation

This work uses a sequence-to-sequence model to translate from noisy, disfluent speech to fluent text with disfluencies removed using the recently collected ‘copy-edited’ references for the Fisher Spanish-English dataset.

Language Models are Unsupervised Multitask Learners

It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.

Marian: Fast Neural Machine Translation in C++

Marian is an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs that can achieve high training and translation speed.

The CMU Pronouncing Dictionary

  • URL: http://www.speech.cs.cmu.edu/cgi-bin/cmudict
  • 1998

Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation

The Style Transformer is proposed, which makes no assumption about the latent representation of the source sentence and leverages the power of the attention mechanism in the Transformer to achieve better style transfer and better content preservation.

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context

This work proposes a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence, which consists of a segment-level recurrence mechanism and a novel positional encoding scheme.

Hafez: an Interactive Poetry Generation System

Hafez is an automatic poetry generation system that integrates a Recurrent Neural Network (RNN) with a Finite State Acceptor (FSA) and learns to adjust its parameters to improve poetry quality.
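A generic sketch of FSA-constrained decoding in this spirit is shown below: the acceptor's current state determines which vocabulary items are legal, and the neural model's distribution is masked accordingly. The transition-table representation is an assumption.

```python
# Mask a neural LM's next-token distribution with a finite-state acceptor.
import torch

def fsa_constrained_step(lm_logits, fsa_state, fsa_transitions):
    """fsa_transitions: dict mapping (state, token_id) -> next_state; tokens with
    no outgoing transition from the current state are disallowed."""
    allowed = torch.tensor(
        [(fsa_state, tok) in fsa_transitions for tok in range(lm_logits.shape[0])]
    )
    masked = lm_logits.masked_fill(~allowed, float("-inf"))
    probs = torch.softmax(masked, dim=-1)
    token = torch.multinomial(probs, 1).item()
    next_state = fsa_transitions[(fsa_state, token)]
    return token, next_state
```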

Data Boost: Text Data Augmentation through Reinforcement Learning Guided Conditional Generation

This paper presents a powerful and easy-to-deploy text augmentation framework, Data Boost, which augments data through reinforcement-learning-guided conditional generation, and evaluates Data Boost on three diverse text classification tasks under five different classifier architectures.