Corpus ID: 201666234

Release Strategies and the Social Impacts of Language Models

@article{Solaiman2019ReleaseSA,
  title={Release Strategies and the Social Impacts of Language Models},
  author={Irene Solaiman and Miles Brundage and Jack Clark and Amanda Askell and Ariel Herbert-Voss and Jeff Wu and Alec Radford and Jasmine Wang},
  journal={ArXiv},
  year={2019},
  volume={abs/1908.09203}
}
Large language models have a range of beneficial uses: they can assist in prose, poetry, and programming; analyze dataset biases; and more. However, their flexibility and generative capabilities also raise misuse concerns. This report discusses OpenAI's work related to the release of its GPT-2 language model. It discusses staged release, which allows time between model releases to conduct risk and benefit analyses as model sizes increased. It also discusses ongoing partnership-based research…
Citations

The Radicalization Risks of GPT-3 and Advanced Neural Language Models
TLDR
GPT-3 demonstrates significant improvement over its predecessor, GPT-2, in generating extremist texts, and shows strength in generating text that accurately emulates interactive, informational, and influential content that could be utilized for radicalizing individuals into violent far-right extremist ideologies and behaviors.
The workweek is the best time to start a family – A Study of GPT-2 Based Claim Generation
TLDR
A pipeline based on GPT-2 for generating coherent claims is suggested, and the types of claims it produces, and their veracity, are explored using an array of manual and automatic assessments.
Limits of Detecting Text Generated by Large-Scale Language Models
TLDR
This work formulates the detection of large-scale language model output as a hypothesis-testing problem, classifying text as genuine or generated, and shows that error exponents for particular language models are bounded in terms of their perplexity, a standard measure of language generation performance.
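To make the perplexity connection concrete, here is a minimal sketch of a perplexity-thresholding detector in Python, assuming the HuggingFace transformers API; the `gpt2` checkpoint, the threshold value, and the function names are illustrative assumptions, not taken from the paper (whose contribution is the theoretical bound, not an implementation):

```python
# A minimal sketch: score text by its perplexity under a causal LM and
# threshold it. Lower perplexity under the generator's own distribution
# is weak evidence the text was machine-generated.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the model: exp(mean token cross-entropy)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean
        # next-token cross-entropy as `loss`.
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return float(torch.exp(loss))

def looks_generated(text: str, threshold: float = 30.0) -> bool:
    # The threshold here is a made-up example value; in practice it must
    # be calibrated on held-out human and machine text.
    return perplexity(text) < threshold
```

The intuition is that text sampled from a model tends to score lower perplexity under that same model than human-written text does, so a calibrated threshold acts as a crude likelihood-ratio-style test.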
What's in the Box? A Preliminary Analysis of Undesirable Content in the Common Crawl Corpus
TLDR
It is found that the Common Crawl, a colossal web corpus that is extensively used for training language models, contains a significant amount of undesirable content, including hate speech and sexually explicit content, even after filtering procedures.
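For context, corpus filtering procedures of this kind are often as simple as token blocklists; the following Python sketch shows that style of filter and why it is easy to evade (the blocklist contents and function name are placeholders, not the actual filter applied to the Common Crawl):

```python
# A minimal sketch of token-blocklist filtering commonly applied to web
# corpora; BLOCKLIST is a placeholder for a real curated word list.
BLOCKLIST = {"badword1", "badword2"}  # placeholder entries

def passes_filter(document: str) -> bool:
    """True if no blocklisted token appears verbatim in the document."""
    tokens = set(document.lower().split())
    return tokens.isdisjoint(BLOCKLIST)

# Misspellings, spaced-out letters, or hateful content expressed in
# otherwise innocuous words all pass such a filter, which is one reason
# undesirable content survives filtering.
```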
Viable Threat on News Reading: Generating Biased News Using Natural Language Models
TLDR
A threat model is used to demonstrate that publicly available language models can reliably generate biased news content based on an original input news article, and it is shown that a large number of high-quality biased news articles can be generated using controllable text generation.
Feature-based detection of automated language models: tackling GPT-2, GPT-3 and Grover
TLDR
This work proposes a simple feature-based classifier for the detection problem, using carefully crafted features that attempt to model intrinsic differences between human and machine text, offering an accessible “first line of defense” against the abuse of language models.
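As a rough illustration of the feature-based approach, here is a minimal Python sketch assuming scikit-learn; the three features below are illustrative stand-ins, not the paper's crafted feature set:

```python
# A minimal sketch of a feature-based detector: handcrafted text
# statistics fed to a linear classifier.
import re
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(text: str) -> np.ndarray:
    words = re.findall(r"\w+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    return np.array([
        n_words / max(len(sentences), 1),           # mean sentence length
        len(re.findall(r"[,;:]", text)) / n_words,  # punctuation rate
        len(set(words)) / n_words,                  # type-token ratio
    ])

def train_detector(texts, labels):
    # texts: list of strings; labels: 0 = human, 1 = machine.
    X = np.stack([features(t) for t in texts])
    return LogisticRegression().fit(X, labels)
```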
How does fake news spread? Understanding pathways of disinformation spread through APIs
TLDR
This work highlights how the stages in the framework were activated during the 2016 US Presidential Elections, before providing policy recommendations on issues relating to access to APIs, algorithmic content, and advertisements, and suggesting rapid responses to coordinated campaigns.
The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers
TLDR
An in-depth analysis of the impact of multi-word suggestion choices from a neural language model on user behaviour regarding input and text composition in email writing reveals benefits for ideation, and costs for efficiency, when suggesting multiple phrases.
Belief-based Generation of Argumentative Claims
TLDR
This work proposes the task of belief-based claim generation, studies the research question of how to model and encode a user's beliefs into generated argumentative text, and extends state-of-the-art text generation models with extra input reflecting users' beliefs.
Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study
TLDR
This work demonstrates via human evaluation that classifiers trained to discriminate between human and machine-generated text emerge as unsupervised predictors of “page quality”, able to detect low-quality content without any training.

References

Showing 1–10 of 51 references
Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science
TLDR
It is argued that data statements will help alleviate issues related to exclusion and bias in language technology, lead to better precision in claims about how natural language processing research can generalize and thus better engineering results, protect companies from public embarrassment, and ultimately lead to language technology that meets its users in their own preferred linguistic style.
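The schema the paper proposes has a fixed set of sections; the Python sketch below records them as a structured object (the field values are placeholders, and representing a data statement as a dict is an illustrative choice, not the paper's format):

```python
# A minimal sketch of a data statement as a structured record. The
# section names follow the schema proposed in the paper; the values
# are placeholders to be filled in per dataset.
data_statement = {
    "curation_rationale": "...",
    "language_variety": "...",       # e.g. a BCP-47 tag plus prose
    "speaker_demographic": "...",
    "annotator_demographic": "...",
    "speech_situation": "...",
    "text_characteristics": "...",
    "recording_quality": "...",
    "other": "...",
}
```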
The Social Impact of Natural Language Processing
TLDR
A number of social implications of NLP are identified, and their ethical significance, as well as ways to address them, are discussed.
The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction
TLDR
It is shown that, in line with recent results in other NLP tasks, Transformer architectures achieve consistently high performance and provide a competitive baseline for future machine learning models.
Attesting Biases and Discrimination using Language Semantics
TLDR
This work focuses on how to attest whether AI agents treat users fairly, without discriminating against particular individuals or groups through biases in language, and outlines a roadmap for future research to better understand and attest to problematic AI biases derived from language.
GLTR: Statistical Detection and Visualization of Generated Text
TLDR
This work introduces GLTR, a tool to support humans in detecting whether a text was generated by a model, and shows that the annotation scheme provided by GLTR improves the human detection rate of fake text from 54% to 72% without any prior training.
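GLTR's core statistic is the rank of each observed token under a language model's predictive distribution; here is a minimal Python sketch of that computation, assuming the HuggingFace transformers API (the `gpt2` checkpoint and the top-10/100/1000 buckets mirror GLTR's public demo, but the function names are illustrative):

```python
# A minimal sketch of GLTR-style per-token ranks: for each position, the
# rank of the actually observed next token in the model's predicted
# distribution. Human text tends to use more high-rank (low-probability)
# tokens than sampled model text.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_ranks(text: str) -> list[int]:
    ids = tokenizer(text, return_tensors="pt").input_ids[0]
    with torch.no_grad():
        logits = model(ids.unsqueeze(0)).logits[0]  # (seq_len, vocab)
    ranks = []
    for pos in range(len(ids) - 1):
        # Sort candidates by predicted score, then locate the real token.
        order = torch.argsort(logits[pos], descending=True)
        rank = int((order == ids[pos + 1]).nonzero().item()) + 1
        ranks.append(rank)
    return ranks

def bucket(rank: int) -> str:
    """GLTR-style color buckets: top-10, top-100, top-1000, rest."""
    return "top10" if rank <= 10 else "top100" if rank <= 100 \
        else "top1000" if rank <= 1000 else "tail"
```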
Defending Against Neural Fake News
TLDR
Grover, a model for controllable text generation, is presented; the best current discriminators can classify neural fake news from real, human-written news with 73% accuracy, assuming access to a moderate level of training data, and the best defense against Grover turns out to be Grover itself, with 92% accuracy.
Multi-turn Dialogue Response Generation with Autoregressive Transformer Models
TLDR
This work examines the use of autoregressive transformer models for multi-turn dialogue response generation, reporting state-of-the-art performance on the two datasets under several metrics, including BLEU, ROUGE, and distinct n-grams.
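Of the metrics listed, distinct n-grams is simple enough to show inline; a minimal Python sketch follows (the function name is illustrative, and BLEU/ROUGE would typically come from libraries such as sacrebleu and rouge-score rather than being hand-rolled):

```python
# A minimal sketch of the distinct-n diversity metric: the number of
# unique n-grams across generated responses divided by the total number
# of n-grams. Higher values indicate less repetitive generation.
def distinct_n(responses: list[str], n: int = 2) -> float:
    ngrams = []
    for r in responses:
        toks = r.split()
        ngrams.extend(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return len(set(ngrams)) / max(len(ngrams), 1)

# Example: distinct_n(["i am fine", "i am fine"], n=2) == 0.5,
# since 2 of the 4 bigrams are duplicates.
```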
Reducing malicious use of synthetic media research: Considerations and potential release practices for machine learning
TLDR
Recommendations that the machine learning community might benefit from are suggested: working with subject-matter experts to increase understanding of the risk landscape and possible mitigation strategies, and building a community and norms around understanding the impacts of ML research, e.g. through regular workshops at major conferences.
Hello, It's GPT-2 - How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems
TLDR
This paper proposes a task-oriented dialogue model that operates solely on text input: it effectively bypasses explicit policy and language generation modules, and holds promise to mitigate the data scarcity problem and to support the construction of more engaging and more eloquent task-oriented conversational agents.
What makes a good conversation? How controllable attributes affect human judgments
TLDR
This work examines two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking, and shows that by controlling combinations of these variables their models obtain clear improvements in human quality judgments.
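Weighted decoding, one of the two methods named above, adjusts token logits at each decoding step by weighted feature scores; here is a minimal Python sketch for the repetition attribute, assuming PyTorch and a HuggingFace-style causal LM (the penalty weight and greedy decoding are illustrative choices, not the paper's configuration):

```python
# A minimal sketch of weighted decoding against repetition: a penalty is
# subtracted from the logits of tokens that have already been generated.
import torch

@torch.no_grad()
def decode_with_repetition_penalty(model, input_ids, steps=20, weight=2.0):
    # input_ids: (1, n) prompt tensor; model: HuggingFace-style causal LM.
    ids = input_ids.clone()
    for _ in range(steps):
        logits = model(ids).logits[0, -1]   # next-token logits, (vocab,)
        logits[ids[0].unique()] -= weight   # weighted feature: penalize reuse
        next_id = torch.argmax(logits)      # greedy step (illustrative)
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return ids
```

Other attributes work the same way: define a per-token feature (e.g. rarity for specificity), multiply it by a tunable weight, and add it to the logits before choosing the next token.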