Reflective Decoding: Beyond Unidirectional Generation with Off-the-Shelf Language Models

Peter West, Ximing Lu, Ari Holtzman, Chandra Bhagavatula, Jena D. Hwang, Yejin Choi

Publicly available, large pretrained Language Models (LMs) generate text with remarkable quality, but only sequentially from left to right. As a result, they are not immediately applicable to generation tasks that break the unidirectional assumption, such as paraphrasing or text-infilling, necessitating task-specific supervision. In this paper, we present Reflective Decoding, a novel unsupervised algorithm that allows for direct application of unidirectional LMs to non-sequential tasks. Our 2… 

Flexible Generation from Fragmentary Linguistic Input

This work supports the hypothesis that human behavior in novel language tasks and environments may be better characterized by flexible composition of basic computational motifs than by direct specialization.

InCoder: A Generative Model for Code Infilling and Synthesis

InCoder is a unified generative model that can perform program synthesis (via left-to-right generation) as well as editing (via infilling). The ability to condition on bidirectional context substantially improves performance on challenging tasks such as type inference, comment generation, and variable renaming.

Knowledge Infused Decoding

Knowledge Infused Decoding (KID) is a novel decoding algorithm for generative LMs that dynamically infuses external knowledge into each step of LM decoding. It maintains a local knowledge memory based on the current context, which interacts with a dynamically created external knowledge trie, and continuously updates the local memory as a knowledge-aware constraint to guide decoding via reinforcement learning.

Deduplicating Training Data Mitigates Privacy Risks in Language Models

The rate at which language models regenerate training sequences is superlinearly related to a sequence’s count in the training set. After the training data is deduplicated, language models are considerably more secure against privacy attacks.

Novelty Controlled Paraphrase Generation with Retrieval Augmented Conditional Prompt Tuning

This work demonstrates the effectiveness of the proposed approaches for retaining the semantic content of the original text while inducing lexical novelty in paraphrase generation.

NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

NeuroLogic A*esque is a decoding algorithm that incorporates heuristic estimates of future cost. It develops lookahead heuristics that are efficient for large-scale language models, making the method a drop-in replacement for common techniques such as beam search and top-k sampling.



Generating Sentences from Disentangled Syntactic and Semantic Spaces

The proposed method explicitly models syntactic information in the VAE’s latent space by using the linearized tree sequence, leading to better performance of language generation and the advantage of sampling in the disentangled syntactic and semantic latent spaces.

COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

This investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs, and suggests that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.

Findings of the 2019 Conference on Machine Translation (WMT19)

This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019. Participants were asked to build machine translation systems for any

BERTScore: Evaluating Text Generation with BERT

This work proposes BERTScore, an automatic evaluation metric for text generation that correlates better with human judgments and provides stronger model selection performance than existing metrics.

Defending Against Neural Fake News

This work presents Grover, a model for controllable text generation. The best current discriminators can classify neural fake news from real, human-written news with 73% accuracy, assuming access to a moderate level of training data; the best defense against Grover turns out to be Grover itself, with 92% accuracy.

ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations

This work uses ParaNMT-50M, a dataset of more than 50 million English-English sentential paraphrase pairs, to train paraphrastic sentence embeddings that outperform all supervised systems on every SemEval semantic textual similarity competition, in addition to showing how it can be used for paraphrase generation.

Attention is All you Need

This work proposes the Transformer, a new simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. It generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.

Optimizing Statistical Machine Translation for Text Simplification

This work is the first to design automatic metrics that are effective for tuning and evaluating simplification systems, which will facilitate iterative development for this task.
