FUDGE: Controlled Text Generation With Future Discriminators

@article{Yang2021FUDGECT,
  title={FUDGE: Controlled Text Generation With Future Discriminators},
  author={Kevin Yang and Dan Klein},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.05218}
}
We propose Future Discriminators for Generation (FUDGE), a flexible and modular method for controlled text generation. Given a pre-existing model G for generating text from a distribution of interest, FUDGE enables conditioning on a desired attribute a (for example, formality) while requiring access only to G’s output logits. FUDGE learns an attribute predictor operating on a partial sequence, and uses this predictor’s outputs to adjust G’s original probabilities. We show that FUDGE models… 
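
For concreteness, the following is a minimal Python sketch of the FUDGE decoding rule, assuming toy stand-ins for the base model G and the attribute predictor (the function names and the toy scoring rule are illustrative, not taken from the paper's code). Each candidate next token is rescored by the attribute predictor applied to the extended prefix, so that P(next token | prefix, attribute) is proportional to P(attribute | extended prefix) times P(next token | prefix).

import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8  # hypothetical toy vocabulary size

def base_lm_logits(prefix):
    # Stand-in for G's next-token logits given a prefix of token ids.
    return rng.normal(size=VOCAB)

def attribute_log_prob(prefix):
    # Stand-in for the attribute predictor: log P(a | partial sequence).
    # FUDGE trains this predictor on prefixes so it can score incomplete text.
    return np.log(0.8) if sum(prefix) % 2 == 0 else np.log(0.2)

def fudge_step(prefix, top_k=4):
    # One decoding step: rescore G's top-k candidates by the attribute
    # predictor evaluated on each candidate's extended prefix.
    logits = base_lm_logits(prefix)
    log_p = logits - np.logaddexp.reduce(logits)            # log P(x_t | x_<t)
    candidates = np.argsort(log_p)[-top_k:]                  # top-k for efficiency
    scores = np.array([log_p[t] + attribute_log_prob(prefix + [int(t)])
                       for t in candidates])
    probs = np.exp(scores - np.logaddexp.reduce(scores))     # renormalize
    return int(rng.choice(candidates, p=probs))

prefix = [1, 3]
for _ in range(5):
    prefix.append(fudge_step(prefix))
print(prefix)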

DIRECTOR: Generator-Classifiers For Supervised Language Modeling

TLDR
This paper introduces a new architecture, DIRECTOR, that consists of a unified generator-classifier with both a language modeling and a classification head for each output token, and shows that the model has competitive training and decoding speed compared to standard language models while yielding superior results, alleviating known issues while maintaining generation quality.
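
A minimal sketch of the generator-classifier idea, assuming toy weight matrices in place of a trained Transformer (all names are illustrative): the language-modeling head's log-probabilities and the per-token classifier's log-probabilities are summed before sampling.

import torch

torch.manual_seed(0)
DIM, VOCAB = 8, 12
W_lm = torch.randn(DIM, VOCAB)                  # language-modeling head
W_cls = torch.randn(DIM, VOCAB)                 # per-token classification head

def director_step(h, gamma=1.0):
    # Combine the LM head with the classifier's per-token probability that
    # each candidate next token is "good" for the desired behavior.
    lm_logprobs = torch.log_softmax(h @ W_lm, dim=-1)
    cls_logprobs = torch.nn.functional.logsigmoid(h @ W_cls)
    scores = lm_logprobs + gamma * cls_logprobs
    return torch.distributions.Categorical(logits=scores).sample().item()

h = torch.randn(DIM)                            # decoder hidden state for this position
print(director_step(h))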

Constrained Sampling from Language Models via Langevin Dynamics in Embedding Spaces

TLDR
This work proposes a sampling procedure that combines the log-likelihood of the language model with arbitrary differentiable constraints into a single energy function; and generates samples by initializing the entire output sequence with noise and following a Markov chain defined by Langevin Dynamics using the gradients of this energy.
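
A minimal sketch of this sampling procedure, assuming toy quadratic stand-ins for the language-model energy and the differentiable constraint (the real method uses the LM's log-likelihood and task-specific losses):

import torch

torch.manual_seed(0)
SEQ_LEN, DIM = 6, 4

def lm_energy(emb):
    # Stand-in for -log P_LM(sequence): pulls embeddings toward the origin.
    return 0.5 * (emb ** 2).sum()

def constraint_energy(emb):
    # Stand-in for a differentiable constraint: pulls the mean embedding
    # toward a fixed target direction.
    target = torch.ones(DIM)
    return ((emb.mean(dim=0) - target) ** 2).sum()

def langevin_sample(steps=200, step_size=0.05, constraint_weight=2.0):
    # Initialize the entire output sequence (as embeddings) with noise.
    emb = torch.randn(SEQ_LEN, DIM, requires_grad=True)
    for _ in range(steps):
        energy = lm_energy(emb) + constraint_weight * constraint_energy(emb)
        grad, = torch.autograd.grad(energy, emb)
        noise = torch.randn_like(emb) * (2 * step_size) ** 0.5
        # Langevin update: a gradient step on the energy plus Gaussian noise.
        emb = (emb - step_size * grad + noise).detach().requires_grad_(True)
    return emb.detach()

print(langevin_sample().mean(dim=0))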

Improving Controllable Text Generation with Position-Aware Weighted Decoding

TLDR
A novel framework based on existing weighted decoding methods called CAT-PAW is proposed, which introduces a lightweight regulator to adjust bias signals from the controller at different decoding positions to solve the control strength/fluency trade-off problem.

SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks

TLDR
A novel approach to control the generation of Transformer-based pre-trained language models is proposed: the SideControl framework, which leverages a novel control attributes loss to incorporate useful control signals, and is shown to perform well with very limited training samples.

Composable Text Controls in Latent Space with ODEs

TLDR
A new efficient approach for composable text operations in the compact latent space of text based on ordinary differential equations given arbitrary plug-in operators, which permits diverse control operators acquired using any relevant data from different domains.

Offline RL for Natural Language Generation with Implicit Language Q Learning

TLDR
This work proposes a novel offline RL motivated method, implicit language Q-learning (ILQL), designed for use on language models, that combines both the flexible utility optimization framework of traditional RL algorithms with supervised learning’s ability to leverage existing data and its simplicity and stability.

Controllable Text Generation with Neurally-Decomposed Oracle

TLDR
A general and efficient framework to control auto-regressive generation models with a NeurAlly-Decomposed Oracle (NADO) is proposed, which guides the base model towards the given oracle while maintaining high generation quality.

Diffusion-LM Improves Controllable Text Generation

TLDR
A new non-autoregressive language model based on continuous diffusion that iteratively denoises a sequence of Gaussian vectors into word vectors, yielding a sequence of intermediate latent variables that enables a simple gradient-based algorithm to perform complex, controllable generation tasks.
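
A minimal sketch of gradient-guided denoising in this style, assuming a toy word-embedding table, a toy denoiser, and a toy control loss in place of the trained networks:

import torch

torch.manual_seed(0)
SEQ_LEN, DIM, VOCAB = 5, 4, 10
word_embs = torch.randn(VOCAB, DIM)             # toy word-embedding table

def denoise(x_t):
    # Stand-in denoiser: shrink latents toward their nearest word embeddings
    # (a trained Transformer plays this role in Diffusion-LM).
    nearest = word_embs[torch.cdist(x_t, word_embs).argmin(dim=-1)]
    return x_t + 0.3 * (nearest - x_t)

def control_loss(x_t):
    # Stand-in differentiable control: prefer sequences whose mean embedding
    # has a large first coordinate.
    return -x_t.mean(dim=0)[0]

x = torch.randn(SEQ_LEN, DIM)                   # start from pure Gaussian noise
for _ in range(50):
    x = x.detach().requires_grad_(True)
    grad, = torch.autograd.grad(control_loss(denoise(x)), x)
    # One denoising step, nudged by the gradient of the control loss.
    x = denoise(x) - 0.1 * grad

tokens = torch.cdist(x, word_embs).argmin(dim=-1)   # round latents to words
print(tokens.tolist())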

Mix and Match: Learning-free Controllable Text Generation using Energy Language Models

TLDR
This work proposes Mix and Match LM, a global score-based alternative for controllable text generation that combines arbitrary pre-trained black-box models for achieving the desired attributes in the generated text without involving any fine-tuning or structural assumptions about the black-box models.
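
A minimal sketch of this global score-based sampling, assuming toy black-box scorers and a symmetric single-token proposal in place of the pretrained masked-LM proposal distribution:

import math, random

random.seed(0)
VOCAB = list(range(8))
SEQ_LEN = 6

def fluency_score(seq):
    # Stand-in for a black-box LM score (higher = smoother token sequence).
    return -sum(abs(a - b) for a, b in zip(seq, seq[1:]))

def attribute_score(seq):
    # Stand-in for a black-box attribute classifier score.
    return sum(1.0 for tok in seq if tok % 2 == 0)

def energy(seq):
    # Product of experts in probability space = sum of expert scores in log space.
    return -(fluency_score(seq) + 2.0 * attribute_score(seq))

seq = [random.choice(VOCAB) for _ in range(SEQ_LEN)]
for _ in range(2000):
    pos = random.randrange(SEQ_LEN)
    proposal = list(seq)
    proposal[pos] = random.choice(VOCAB)        # symmetric single-token proposal
    # Metropolis-Hastings acceptance using the combined energy.
    if random.random() < math.exp(min(0.0, energy(seq) - energy(proposal))):
        seq = proposal
print(seq, energy(seq))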

Controlled Text Generation as Continuous Optimization with Multiple Constraints

TLDR
This work formulates the decoding process as an optimization problem in which the multiple attributes to be controlled can easily be incorporated as differentiable constraints, by relaxing the discrete optimization to a continuous one.
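
A minimal sketch of the continuous relaxation, assuming toy differentiable losses for fluency and the attribute constraint (the real method uses the LM's likelihood and trained attribute models):

import torch

torch.manual_seed(0)
SEQ_LEN, VOCAB = 5, 10
TARGET_TOKEN = 3                                # hypothetical attribute target

soft_logits = torch.zeros(SEQ_LEN, VOCAB, requires_grad=True)   # relaxed sequence
opt = torch.optim.Adam([soft_logits], lr=0.1)

unigram_prior = torch.softmax(torch.linspace(1.0, -1.0, VOCAB), dim=0)

def lm_loss(p):
    # Stand-in for the LM's negative log-likelihood: cross-entropy against a
    # fixed unigram prior.
    return -(p * unigram_prior.log()).sum()

def constraint_loss(p):
    # Stand-in differentiable constraint: each position should prefer TARGET_TOKEN.
    return -p[:, TARGET_TOKEN].log().mean()

for _ in range(200):
    opt.zero_grad()
    p = soft_logits.softmax(dim=-1)             # continuous relaxation of tokens
    loss = lm_loss(p) + 1.0 * constraint_loss(p)
    loss.backward()
    opt.step()

print(soft_logits.softmax(dim=-1).argmax(dim=-1).tolist())   # discretize at the end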

References

Showing 1-10 of 48 references

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

TLDR
The Plug and Play Language Model (PPLM) for controllable language generation is proposed, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM.
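
A minimal sketch of the hidden-state steering idea, assuming a toy frozen output projection and a toy attribute classifier in place of the pretrained models:

import torch

torch.manual_seed(0)
DIM, VOCAB = 8, 12
W_out = torch.randn(DIM, VOCAB)                 # frozen LM output projection
w_attr = torch.randn(DIM)                       # toy attribute classifier weights

def attr_log_prob(h):
    # Stand-in classifier: log P(attribute | hidden state).
    return torch.nn.functional.logsigmoid(h @ w_attr)

def pplm_step(h, n_updates=5, step_size=0.05):
    # Nudge the hidden state toward the attribute, then decode from it;
    # the LM itself is never fine-tuned.
    for _ in range(n_updates):
        h = h.detach().requires_grad_(True)
        grad, = torch.autograd.grad(attr_log_prob(h), h)
        h = h + step_size * grad                # ascend the classifier's log-prob
    logits = h.detach() @ W_out
    return torch.distributions.Categorical(logits=logits).sample().item()

h = torch.randn(DIM)                            # hidden state from the base LM
print(pplm_step(h))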

GeDi: Generative Discriminator Guided Sequence Generation

TLDR
GeDi is proposed as an efficient method for using smaller LMs as generative discriminators to guide generation from large LMs to make them safer and more controllable; GeDi is found to give stronger controllability than the state-of-the-art method while also achieving generation speeds more than 30 times faster.
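
A minimal sketch of the Bayes-rule reweighting behind this guidance, assuming toy next-token distributions in place of the large LM and the small class-conditional LMs:

import numpy as np

rng = np.random.default_rng(0)
VOCAB = 6

def large_lm_logprobs(prefix):
    # Stand-in for the large LM's next-token log-probabilities.
    logits = rng.normal(size=VOCAB)
    return logits - np.logaddexp.reduce(logits)

def class_lm_logprobs(prefix, desired):
    # Stand-in for the small class-conditional LM (the generative discriminator).
    bias = np.arange(VOCAB) / VOCAB
    logits = rng.normal(size=VOCAB) + (bias if desired else -bias)
    return logits - np.logaddexp.reduce(logits)

def gedi_step(prefix, omega=3.0):
    lp_lm = large_lm_logprobs(prefix)
    lp_pos = class_lm_logprobs(prefix, desired=True)
    lp_neg = class_lm_logprobs(prefix, desired=False)
    # Bayes rule per candidate token: log P(desired | prefix, token), up to a constant.
    lp_attr = lp_pos - np.logaddexp(lp_pos, lp_neg)
    scores = lp_lm + omega * lp_attr            # reweight the large LM by the discriminator
    probs = np.exp(scores - np.logaddexp.reduce(scores))
    return int(rng.choice(VOCAB, p=probs))

print([gedi_step([0]) for _ in range(5)])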

Fluent Translations from Disfluent Speech in End-to-End Speech Translation

TLDR
This work uses a sequence-to-sequence model to translate from noisy, disfluent speech to fluent text with disfluencies removed using the recently collected ‘copy-edited’ references for the Fisher Spanish-English dataset.

Language Models are Unsupervised Multitask Learners

TLDR
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.

Marian: Fast Neural Machine Translation in C++

TLDR
Marian is an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs that can achieve high training and translation speed.

The CMU Pronouncing Dictionary

  • URL: http://www.speech.cs.cmu.edu/cgi-bin/cmudict
  • 1998

Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation

TLDR
The Style Transformer is proposed, which makes no assumption about the latent representation of the source sentence and harnesses the power of the attention mechanism in the Transformer to achieve better style transfer and better content preservation.

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context

TLDR
This work proposes a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence, which consists of a segment-level recurrence mechanism and a novel positional encoding scheme.

Hafez: an Interactive Poetry Generation System

TLDR
Hafez is an automatic poetry generation system that integrates a Recurrent Neural Network (RNN) with a Finite State Acceptor (FSA) and learns to adjust its parameters to improve poetry quality.

Data Boost: Text Data Augmentation through Reinforcement Learning Guided Conditional Generation

TLDR
This paper presents a powerful and easy-to-deploy text augmentation framework, Data Boost, which augments data through reinforcement learning guided conditional generation and evaluates Data Boost on three diverse text classification tasks under five different classifier architectures.