Corpus ID: 236469562

Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation

  • Yufei Wang, Can Xu, Huang Hu, Chongyang Tao, Stephen Wan, Mark Dras, Mark Johnson, Daxin Jiang
  • Published 28 July 2021
  • Computer Science
  • ArXiv
Sequence-to-Sequence (Seq2Seq) neural text generation models, especially the pre-trained ones (e.g., BART and T5), have exhibited compelling performance on various natural language generation tasks. However, the black-box nature of these models limits their application in tasks where specific rules (e.g., controllable constraints, prior knowledge) need to be executed. Previous works either design specific model structures (e.g., Copy Mechanism corresponding to the rule “the generated output…


Plug and Play Language Models: A Simple Approach to Controlled Text Generation
The Plug and Play Language Model (PPLM) for controllable language generation is proposed, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM.
CTRL: A Conditional Transformer Language Model for Controllable Generation
CTRL, a 1.63 billion-parameter conditional transformer language model, is released; it is trained to condition on control codes that govern style, content, and task-specific behavior, providing more explicit control over text generation.
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
BART is presented, a denoising autoencoder for pretraining sequence-to-sequence models, which matches the performance of RoBERTa on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks.
Incorporating Copying Mechanism in Sequence-to-Sequence Learning
This paper incorporates copying into neural network-based Seq2Seq learning and proposes CopyNet, an encoder-decoder model that integrates regular word generation in the decoder with a copying mechanism that can select sub-sequences of the input sequence and place them at appropriate positions in the output sequence.
Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search
Experiments show that Grid Beam Search (GBS) can provide large improvements in translation quality in interactive scenarios, and that, even without any user input, it can be used to achieve significant gains in performance in domain adaptation scenarios.
Language Models are Unsupervised Multitask Learners
It is demonstrated that language models begin to learn natural language processing tasks such as question answering, machine translation, reading comprehension, and summarization without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Attention is All you Need
A new simple network architecture, the Transformer, is proposed; it is based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.
Neural Machine Translation with External Phrase Memory
In this paper, we propose phraseNet, a neural machine translator with a phrase memory which stores phrase pairs in symbolic form, mined from corpus or specified by human experts. For any given source…
Modeling Coverage for Neural Machine Translation
This paper proposes coverage-based NMT, which maintains a coverage vector to keep track of the attention history and improves both translation quality and alignment quality over standard attention-based NMT.
Improving Neural Machine Translation through Phrase-based Forced Decoding
A soft forced decoding algorithm is proposed, which can always successfully find a decoding path for any NMT output, and it is shown that using the forced decoding cost to rerank the NMT outputs can successfully improve translation quality on four different language pairs.