Corpus ID: 237593001

Recursively Summarizing Books with Human Feedback

Jeff Wu, Long Ouyang, Daniel M. Ziegler, Nissan Stiennon, Ryan Lowe, Jan Leike, Paul Francis Christiano
A major challenge for scaling machine learning is training models to perform tasks that are very difficult or time-consuming for humans to evaluate. We present progress on this problem on the task of abstractive summarization of entire fiction novels. Our method combines learning from human feedback with recursive task decomposition: we use models trained on smaller parts of the task to assist humans in giving feedback on the broader task. We collect a large volume of demonstrations and…


DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization
A new approach for long-input summarization, Dynamic Latent Extraction for Abstractive Summarization (DYLE), jointly trains an extractor with an abstractor and treats the extracted text snippets as the latent variable; it introduces a consistency loss and makes the generation process highly interpretable.
Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents
Summ^N is a simple, flexible, and effective multi-stage framework for input texts that are longer than the maximum context length of typical pretrained LMs. It first generates the coarse summary in multiple stages and then produces the final fine-grained summary based on it.
Deep Transfer Learning & Beyond: Transformer Language Models in Information Systems Research
A review of existing information systems (IS) literature reveals that suboptimal text mining techniques are prevalent and that more advanced transformer language models (TLMs) could be applied to enhance and increase IS research involving text data and to enable new IS research topics, thus creating more value for the research community.
When Combating Hype, Proceed with Caution
This paper urges researchers to be careful about false claims about the capabilities of state-of-the-art language technology and suggests some research directions and communication strategies that will make it easier to avoid or rebut them.
Finetuned Language Models Are Zero-Shot Learners
It is shown that instruction tuning (finetuning language models on a collection of datasets described via instructions) substantially boosts zero-shot performance on unseen tasks, and that FLAN substantially improves on the performance of its unmodified counterpart and surpasses zero-shot 175B GPT-3 on 20 of the 25 datasets evaluated.


Fine-Tuning Language Models from Human Preferences
This paper builds on advances in generative pretraining of language models to apply reward learning to four natural language tasks: continuing text with positive sentiment or physically descriptive language, and summarization tasks on the TL;DR and CNN/Daily Mail datasets.
Better Rewards Yield Better Summaries: Learning to Summarise Without References
This work learns a reward function from human ratings on 2,500 summaries that can be used to train RL-based summarisation systems without using any reference summaries, and shows that the learned rewards have significantly higher correlation with human ratings than previous approaches.
Language Models are Unsupervised Multitask Learners
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
A simple extractive step is performed before generating a summary; its output is used to condition the transformer language model on relevant information before the model is tasked with generating a summary.
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
This work proposes SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e., salient sentences selected from the source documents, using contextualized embeddings and soft token alignment techniques.
Text Summarization with Pretrained Encoders
This paper introduces a novel document-level encoder based on BERT which is able to express the semantics of a document and obtain representations for its sentences and proposes a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two.
On Generating Extended Summaries of Long Documents
This paper exploits the hierarchical structure of documents and incorporates it into an extractive summarization model through a multi-task learning approach, and shows that the multi-tasking approach can adjust the extraction probability distribution in favor of summary-worthy sentences across diverse sections.
An Actor-Critic Algorithm for Sequence Prediction
This paper presents an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL) that conditions the critic network on the ground-truth output, and shows that this method leads to improved performance on both a synthetic task and German-English machine translation.
SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization
This paper proposes SEAL, a Transformer-based model, featuring a new encoder-decoder attention that dynamically extracts/selects input snippets to sparsely attend to for each output segment, and achieves state-of-the-art results on existing long-form summarization tasks.
The NarrativeQA Reading Comprehension Challenge
A new dataset and set of tasks in which the reader must answer questions about stories by reading entire books or movie scripts are presented, designed so that successfully answering their questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience.