Corpus ID: 208617790

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

@article{Dathathri2020PlugAP,
  title={Plug and Play Language Models: A Simple Approach to Controlled Text Generation},
  author={Sumanth Dathathri and Andrea Madotto and Janice Lan and Jane Hung and Eric Frank and Piero Molino and Jason Yosinski and Rosanne Liu},
  journal={ArXiv},
  year={2020},
  volume={abs/1912.02164}
}
Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g., switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data, which entails the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a…
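To make the approach concrete, below is a minimal sketch of the plug-and-play steering loop as described in the abstract and in the summaries that follow: an external attribute classifier scores the LM's hidden state, and its gradient is used to perturb that state before the next token is sampled, while the LM's own weights stay frozen. The toy modules (lm_embed, lm_head, attr_clf), the step size, and the number of update steps are illustrative assumptions, not the paper's actual GPT-2 setup, which perturbs the transformer's key-value history.

```python
# Minimal, self-contained sketch (PyTorch) of gradient-based plug-and-play
# steering: an attribute classifier's gradient nudges the LM's hidden state
# at each decoding step while the LM weights stay frozen. All modules and
# hyperparameters here are toy stand-ins, not the paper's GPT-2 configuration.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, hidden_dim, num_attrs = 100, 32, 2

lm_embed = torch.nn.Embedding(vocab_size, hidden_dim)  # stand-in for the LM body
lm_head = torch.nn.Linear(hidden_dim, vocab_size)      # stand-in for the LM head
attr_clf = torch.nn.Linear(hidden_dim, num_attrs)      # external attribute classifier

def steered_step(prev_token, target_attr, step_size=0.03, n_updates=3):
    """One decoding step: shift the hidden state toward the target attribute."""
    h = lm_embed(prev_token).detach()                   # unperturbed state; LM is frozen
    delta = torch.zeros_like(h, requires_grad=True)     # perturbation to optimize
    for _ in range(n_updates):
        # Attribute-classifier loss on the perturbed state (lower = more on-attribute).
        attr_logits = attr_clf(h + delta)
        loss = F.cross_entropy(attr_logits, target_attr)
        (grad,) = torch.autograd.grad(loss, delta)
        # Gradient step on the perturbation only; no LM parameter is updated.
        delta = (delta - step_size * grad).detach().requires_grad_(True)
    # Sample the next token from the LM head applied to the shifted state.
    probs = F.softmax(lm_head(h + delta.detach()), dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(-1)

next_token = steered_step(torch.tensor([5]), target_attr=torch.tensor([1]))
print(next_token)
```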

Citations

A Plug-and-Play Method for Controlled Text Generation
TLDR
This work presents a plug-and-play decoding method for controlled language generation that can be described in a single sentence: given a topic or keyword, a shift toward semantically similar words is added to the probability distribution over the vocabulary, and annealing this shift is shown to impose hard constraints on language generation.
Attribute Alignment: Controlling Text Generation from Pre-trained Language Models
TLDR
This work proposes a simple and flexible method for controlling text generation by aligning disentangled attribute representations, and shows large performance gains over previous methods while retaining fluency and diversity.
Sentence Bottleneck Autoencoders from Transformer Language Models
TLDR
A sentence-level autoencoder is constructed from a pretrained, frozen transformer language model; it achieves better quality than previous methods that extract representations from pretrained transformers on text similarity, style transfer, and single-sentence classification tasks in the GLUE benchmark, while using fewer parameters than large pretrained models.
Directed Beam Search: Plug-and-Play Lexically Constrained Language Generation
TLDR
Directed Beam Search is proposed, a plug-and-play method for lexically constrained language generation that can be applied to any language model, is easy to implement and can be used for general language generation.
Extracting Latent Steering Vectors from Pretrained Language Models
TLDR
The results suggest that frozen LMs can be effectively controlled through their latent steering space by extracting latent vectors directly from pretrained language model decoders, without fine-tuning.
Control Prefixes for Parameter-Efficient Text Generation
TLDR
A dynamic method, Control Prefixes, is proposed, which allows for the inclusion of conditional input-dependent information, combining the benefits of prompt tuning and controlled generation, and can even outperform full fine-tuning methods.
SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks
TLDR
A novel approach to control the generation of Transformer-based pre-trained language models is proposed: the SideControl framework, which leverages a novel control attributes loss to incorporate useful control signals, and is shown to perform well with very limited training samples.
Controlled Text Generation as Continuous Optimization with Multiple Constraints
TLDR
This work formulates the decoding process as an optimization problem, which allows multiple attributes to be easily incorporated as differentiable constraints, and makes use of Lagrangian multipliers and gradient-descent based techniques to generate the desired text.
Change or Not: A Simple Approach for Plug and Play Language Models on Sentiment Control
TLDR
PPLM (Dathathri et al. 2019) solves the conditional text generation problem without changing the architecture or weights of the pre-trained LM, instead utilizing an external sentiment classifier to calculate a loss, which is then backpropagated to the original LM’s hidden states at each time step.
Controllable Natural Language Generation with Contrastive Prefixes
TLDR
A novel lightweight framework for controllable GPT2 generation, which utilizes a set of small attribute-specific vectors, called prefixes, to steer natural language generation. Experimental results show that the methods can guide generation towards the desired attributes while keeping high linguistic quality.

References

SHOWING 1-10 OF 59 REFERENCES
CTRL: A Conditional Transformer Language Model for Controllable Generation
TLDR
CTRL is released, a 1.63 billion-parameter conditional transformer language model, trained to condition on control codes that govern style, content, and task-specific behavior, providing more explicit control over text generation.
Can Unconditional Language Models Recover Arbitrary Sentences?
TLDR
This work introduces a pair of effective complementary methods for feeding representations into pretrained unconditional language models and a corresponding set of methods to map sentences into and out of this representation space, the reparametrized sentence space.
Language Models are Unsupervised Multitask Learners
TLDR
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Multiple-Attribute Text Rewriting
TLDR
This paper proposes a new model that controls several factors of variation in textual data, in which the usual condition of disentangled representations is replaced with a simpler mechanism based on back-translation, and demonstrates that the fully entangled model produces better generations.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Simple Fusion: Return of the Language Model
TLDR
This work investigates an alternative simple method to use monolingual data for NMT training that combines the scores of a pre-trained and fixed language model (LM) with the scores of a translation model (TM) while the TM is trained from scratch.
Improving Language Understanding by Generative Pre-Training
TLDR
The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, improving upon the state of the art in 9 out of the 12 tasks studied.
Transformer-XL: Attentive Language Models beyond a Fixed-Length Context
TLDR
This work proposes a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence, which consists of a segment-level recurrence mechanism and a novel positional encoding scheme.
Controllable Text Generation
TLDR
A new neural generative model is proposed which combines variational auto-encoders and holistic attribute discriminators for effective imposition of semantic structures in generic generation and manipulation of text.
Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer
TLDR
This paper proposes simpler methods motivated by the observation that text attributes are often marked by distinctive phrases; the strongest method extracts content words by deleting phrases associated with the sentence’s original attribute value, retrieves new phrases associated with the target attribute, and uses a neural model to fluently combine these into a final output.