Corpus ID: 221665105

Learning to summarize from human feedback

@article{Stiennon2020LearningTS,
  title={Learning to summarize from human feedback},
  author={Nisan Stiennon and Long Ouyang and Jeff Wu and Daniel M. Ziegler and Ryan J. Lowe and Chelsea Voss and Alec Radford and Dario Amodei and Paul Christiano},
  journal={ArXiv},
  year={2020},
  volume={abs/2009.01325}
}
As language models become more powerful, training and evaluation are increasingly bottlenecked by the data and metrics used for a particular task. For example, summarization models are often trained to predict human reference summaries and evaluated using ROUGE, but both of these metrics are rough proxies for what we really care about---summary quality. In this work, we show that it is possible to significantly improve summary quality by training a model to optimize for human preferences. We…
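To make the learning signal concrete, below is a minimal sketch of the pairwise reward-model objective this abstract alludes to: a scalar reward model is trained so that human-preferred summaries score higher than rejected ones, and the resulting reward is then maximized with reinforcement learning. The PyTorch module and variable names are illustrative stand-ins, not the authors' released code, and toy feature vectors replace a real language-model encoder.

    # Illustrative sketch only: toy vectors stand in for language-model
    # representations of (post, summary) pairs; all names are hypothetical.
    import torch
    import torch.nn as nn

    class ToyRewardModel(nn.Module):
        """Maps a summary representation to a single scalar reward."""
        def __init__(self, dim: int = 16):
            super().__init__()
            self.score = nn.Linear(dim, 1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.score(x).squeeze(-1)

    reward_model = ToyRewardModel()
    optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

    # Each row encodes one summary from a human comparison pair.
    preferred = torch.randn(64, 16)   # summaries the labeler chose
    rejected = torch.randn(64, 16)    # summaries the labeler rejected

    for _ in range(200):
        # Pairwise (Bradley-Terry style) loss: push r(preferred) above r(rejected).
        loss = -torch.nn.functional.logsigmoid(
            reward_model(preferred) - reward_model(rejected)
        ).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # The trained scalar reward would then be maximized with an RL algorithm
    # (the paper uses PPO), usually alongside a KL penalty that keeps the
    # fine-tuned policy close to the supervised summarization model.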
A Survey of Human-in-the-loop for Machine Learning
TLDR: This survey intends to provide a high-level summarization of major approaches in the field, along with their technical strengths/weaknesses, and motivates interested readers to consider approaches for designing effective human-in-the-loop solutions.
Alignment of Language Agents
TLDR: Within AI research, large language models have recently shown improved performance on certain metrics and in generating text that seems informally impressive, and we may soon see advanced language systems applied in many diverse and important settings.
When Combating Hype, Proceed with Caution
TLDR: This paper urges researchers to be careful about false claims about the capabilities of state-of-the-art language technology and suggests some research directions and communication strategies that will make it easier to avoid or rebut them.
Recursively Summarizing Books with Human Feedback
TLDR: This method combines learning from human feedback with recursive task decomposition: it uses models trained on smaller parts of the task to assist humans in giving feedback on the broader task, and generates sensible summaries of entire books.
A Survey of Collaborative Reinforcement Learning: Interactive Methods and Design Patterns
TLDR: A long-term, in-depth survey investigating human-AI collaborative methods, based on both interactive reinforcement learning algorithms and human-AI collaborative frameworks, between 2011 and 2020, is provided.
An Empirical Investigation of Learning from Biased Toxicity Labels
TLDR: While it is found that initial training on all of the data and fine-tuning on clean data produces models with the highest AUC, it is also found that no single strategy performs best across all fairness metrics.
Calibrate your listeners! Robust communication-based training for pragmatic speakers
  Rose E. Wang, Julia White, Jesse Mu, Noah D. Goodman · ArXiv, 2021
TLDR: It is shown that language drift originates from the poor uncertainty calibration of a neural listener, which makes high-certainty predictions on novel sentences, and an ensemble method with better calibration enables the speaker to generate pragmatic utterances while scaling to a large vocabulary and generalizing to new games and listeners.
Cogment: Open Source Framework For Distributed Multi-actor Human-AI Collaborative Environment
In this work, we introduce Cogment, an open-source framework that introduces an actor formalism to support a variety of humans / agents collaboration topologies and training approaches through an…
Cogment: Open Source Framework For Distributed Multi-actor Training, Deployment & Operations
TLDR: Cogment is presented, a unifying open-source framework that introduces an actor formalism to support a variety of humans-agents collaboration typologies and training approaches and offers solutions to the aforementioned complexities.
Constrained Text Generation with Global Guidance - Case Study on CommonGen
TLDR: This paper considers using reinforcement learning to address the limitation of constrained text generation, measuring global constraints including fluency, common sense and concept coverage with a comprehensive score, which serves as the reward for reinforcement learning.

References

Showing 1-10 of 83 references
Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics
TLDR: Two new objective automatic evaluation methods for machine translation are described: one based on the longest common subsequence between a candidate translation and a set of reference translations, and one relaxing strict n-gram matching to skip-bigram matching.
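As a concrete illustration of the LCS-based scoring this reference describes (the ROUGE-L family), here is a small Python sketch, not the official ROUGE toolkit, that computes an LCS F-measure over token sequences; the beta parameter weights recall relative to precision.

    # Toy re-implementation for illustration; not the official ROUGE scorer.
    def lcs_length(a, b):
        # Classic dynamic-programming longest common subsequence over tokens.
        table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i, x in enumerate(a, 1):
            for j, y in enumerate(b, 1):
                table[i][j] = table[i - 1][j - 1] + 1 if x == y else max(
                    table[i - 1][j], table[i][j - 1]
                )
        return table[len(a)][len(b)]

    def rouge_l_f(candidate, reference, beta=1.2):
        # F-measure built from LCS-based precision and recall.
        lcs = lcs_length(candidate, reference)
        if lcs == 0:
            return 0.0
        precision = lcs / len(candidate)
        recall = lcs / len(reference)
        return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)

    print(rouge_l_f("the cat sat on the mat".split(),
                    "the cat was on the mat".split()))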
Better Rewards Yield Better Summaries: Learning to Summarise Without References
TLDR: This work learns a reward function from human ratings on 2,500 summaries that can be used to train RL-based summarisation systems without using any reference summaries, and shows that the learned rewards have significantly higher correlation with human ratings than previous approaches.
A Deep Reinforced Model for Abstractive Summarization
TLDR: A neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL), produces higher quality summaries.
Scalable agent alignment via reward modeling: a research direction
TLDR: This work outlines a high-level research direction to solve the agent alignment problem centered around reward modeling: learning a reward function from interaction with the user and optimizing the learned reward function with reinforcement learning.
TL;DR: Mining Reddit to Learn Automatic Summarization
TLDR: This work proposes a new method for mining social media for author-provided summaries, taking advantage of the common practice of appending a “TL;DR” to long posts, and yields the Webis-TLDR-17 dataset.
Language Models are Few-Shot Learners
TLDR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
Fine-Tuning Language Models from Human Preferences
TLDR: This paper builds on advances in generative pretraining of language models to apply reward learning to four natural language tasks: continuing text with positive sentiment or physically descriptive language, and summarization tasks on the TL;DR and CNN/Daily Mail datasets.
Improving Language Understanding by Generative Pre-Training
TLDR: The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied.
Language Models are Unsupervised Multitask Learners
TLDR: It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Proximal Policy Optimization Algorithms
We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective…
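For context, the clipped surrogate objective at the core of PPO (with probability ratio r_t(θ), advantage estimate Â_t, and clipping range ε as defined in that paper) can be written as:

    L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[\min\Big(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\big(r_t(\theta),\,1-\epsilon,\,1+\epsilon\big)\,\hat{A}_t\Big)\right],
    \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}.

In the summarization setting above, this is the optimizer applied to the learned reward, typically together with a per-token KL penalty that keeps the fine-tuned policy close to the supervised model.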