Corpus ID: 168169824

Defending Against Neural Fake News

Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, Yejin Choi
Recent progress in natural language generation has raised dual-use concerns. [...] We conclude by discussing ethical issues regarding the technology, and plan to release Grover publicly, helping pave the way for better detection of neural fake news.


Robustness Analysis of Grover for Machine-Generated News Detection

An investigation of Grover’s susceptibility to adversarial attacks such as character-level and word-level perturbations shows that even a single-character alteration can cause Grover to fail, exposing a lack of robustness.
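A character-level perturbation of the kind described above can be sketched as a minimal edit that leaves the text human-readable while potentially flipping a brittle detector's prediction. The function below is an illustrative assumption, not the paper's actual attack, which may differ in which edits it applies and how it selects them:

```python
import random

def perturb_one_char(text, rng=None):
    """Apply a single character-level perturbation: swap two adjacent
    characters at a random position. Illustrative sketch of a minimal
    edit; a real attack would search for the edit that changes the
    detector's output."""
    rng = rng or random.Random(0)
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]  # transpose neighbors
    return "".join(chars)
```

A robustness evaluation would run many such perturbed copies of an article through the detector and measure how often its label flips.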

Detecting Cross-Modal Inconsistency to Defend against Neural Fake News

An approach based on detecting visual-semantic inconsistencies serves as an effective first line of defense and a useful reference for future work in defending against machine-generated disinformation.

Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation

A novel framework for generating articles closer to human-written ones is proposed, which performs self-critical sequence training with natural language inference to ensure the validity of the generated articles and explicitly incorporates propaganda techniques into the generated articles to mimic how humans craft fake news.

Viable Threat on News Reading: Generating Biased News Using Natural Language Models

A threat model is used to demonstrate that publicly available language models can reliably generate biased news content from an input news article, and that a large number of high-quality biased news articles can be generated using controllable text generation.

MALCOM: Generating Malicious Comments to Attack Neural Fake News Detection Models

Malcom, an end-to-end adversarial comment generation framework, is developed that can successfully mislead five of the latest neural detection models to always output targeted real and fake news labels.

The Limitations of Stylometry for Detecting Machine-Generated Fake News

Though stylometry can successfully prevent impersonation by identifying text provenance, it fails to distinguish legitimate LM applications from those that introduce false information, highlighting the need for non-stylometry approaches in detecting machine-generated misinformation.

Are We Safe Yet? The Limitations of Distributional Features for Fake News Detection

A fundamental problem with provenance-based approaches against attackers that auto-generate fake news is identified: fake and legitimate texts can originate from nearly identical sources.

Adversarial Robustness of Neural-Statistical Features in Detection of Generative Transformers

While statistical features underperform neural features, they provide additional adversarial robustness that can be leveraged in ensemble detection models; the work also pioneers the use of ∆MAUVE as a proxy measure for human judgment of adversarial text quality.

Deepfake Text Detection: Limitations and Opportunities

Evaluation of deepfake text from 4 online services powered by Transformer-based tools shows that tapping into the semantic information in the text content is a promising approach for improving the robustness and generalization performance of deepfake text detection schemes.

Synthetic Disinformation Attacks on Automated Fact Verification Systems

This work explores the sensitivity of automated fact-checkers to synthetic adversarial evidence in two simulated settings: ADVERSARIAL ADDITION, where fabricated documents are added to the evidence repository available to the fact-checking system, and ADVERSARIAL MODIFICATION, where existing evidence source documents in the repository are automatically altered.

"Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection

This paper presents LIAR, a new, publicly available dataset for fake news detection, and designs a novel hybrid convolutional neural network that integrates meta-data with text to improve a text-only deep learning model.

Language GANs Falling Short

The impact of exposure bias on sample quality is less severe than previously thought, and temperature tuning provides a better quality / diversity trade-off than adversarial training while being easier to train, easier to cross-validate, and less computationally expensive.
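The temperature tuning discussed above divides the model's logits by a scalar before sampling: temperatures below 1 sharpen the distribution toward high-probability tokens (favoring quality), while temperatures above 1 flatten it (favoring diversity). A minimal sketch, with NumPy standing in for a real language model's output logits:

```python
import numpy as np

def temperature_sample(logits, temperature=1.0, rng=None):
    """Sample a token index after scaling logits by 1/temperature.
    T < 1 sharpens the distribution; T > 1 flattens it."""
    rng = rng or np.random.default_rng()
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

Sweeping the temperature traces out the quality/diversity trade-off the paper compares against adversarial training.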

The Curious Case of Neural Text Degeneration

By sampling text from the dynamic nucleus of the probability distribution, which allows for diversity while effectively truncating the less reliable tail of the distribution, the resulting text better matches the quality of human text, yielding enhanced diversity without sacrificing fluency and coherence.
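Nucleus (top-p) sampling as described above keeps only the smallest set of tokens whose cumulative probability exceeds a threshold p, renormalizes, and samples from that set. A minimal NumPy sketch, assuming raw logits as input:

```python
import numpy as np

def nucleus_sample(logits, p=0.9, rng=None):
    """Top-p (nucleus) sampling: keep the smallest set of tokens whose
    cumulative probability exceeds p, renormalize, and sample from it."""
    rng = rng or np.random.default_rng()
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]              # tokens by descending probability
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # smallest nucleus covering p
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))
```

Because the nucleus is recomputed at every step, its size adapts to the model's confidence: a peaked distribution yields a tiny nucleus, a flat one a large nucleus.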

HellaSwag: Can a Machine Really Finish Your Sentence?

The construction of HellaSwag, a new challenge dataset, and its resulting difficulty, sheds light on the inner workings of deep pretrained models, and suggests a new path forward for NLP research, in which benchmarks co-evolve with the evolving state-of-the-art in an adversarial way, so as to present ever-harder challenges.

Unifying Human and Statistical Evaluation for Natural Language Generation

This paper proposes a unified framework which evaluates both diversity and quality, based on the optimal error rate of predicting whether a sentence is human- or machine-generated, called HUSE, which is efficiently estimated by combining human and statistical evaluation.

Automatic Detection of Fake News

This paper introduces two novel datasets for the task of fake news detection, covering seven different news domains, and conducts a set of learning experiments to build accurate fake news detectors that can achieve accuracies of up to 76%.

Toward Controlled Generation of Text

A new neural generative model is proposed which combines variational auto-encoders and holistic attribute discriminators for effective imposition of semantic structures in the generation and manipulation of text.

FEVER: a Large-scale Dataset for Fact Extraction and VERification

This paper introduces a new publicly available dataset for verification against textual sources, FEVER, which consists of 185,445 claims generated by altering sentences extracted from Wikipedia and subsequently verified without knowledge of the sentence they were derived from.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

A benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models for understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models, which favors models that can represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge-transfer across tasks.

Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking

Experiments show that while media fact-checking remains an open research question, stylistic cues can help determine the truthfulness of text.