Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking

Hannah Rashkin, Eunsol Choi, Jin Yea Jang, Svitlana Volkova, Yejin Choi
Conference on Empirical Methods in Natural Language Processing
We present an analytic study on the language of news media in the context of political fact-checking and fake news detection. We compare the language of real news with that of satire, hoaxes, and propaganda to find linguistic characteristics of untrustworthy text. To probe the feasibility of automatic political fact-checking, we also present a case study based on PolitiFact.com using their factuality judgments on a 6-point scale. Experiments show that while media fact-checking remains to be an…

Analyzing Political Bias and Unfairness in News Articles at Different Levels of Granularity

This paper uses a new corpus of news articles with labels derived from adfontesmedia.com to develop a neural model for bias assessment, and finds insightful bias patterns at different levels of text granularity, from single words to the whole article discourse.

What Does Fake Look Like? A Review of the Literature on Intentional Deception in the News and on Social Media

This paper focuses on the content features of intentional deceptive information in the news (i.e., fake news) and on social media. Based on an extensive review of relevant literature (i.e.,

Using Natural Language to Predict Bias and Factuality in Media with a Study on Rationalization

A system to classify media sources’ political bias and factuality levels by analyzing the language that gives fake news its contagious and damaging power is presented.

Fact-Checking, Fake News, Propaganda, and Media Bias: Truth Seeking in the Post-Truth Era

The tutorial will offer an overview of the broad and emerging research area of disinformation, with focus on the latest developments and research directions.

Linguistic Signals under Misinformation and Fact-Checking

It is found that linguistic signals in user comments vary significantly with the veracity of posts, e.g., more misinformation-awareness signals and extensive emoji and swear word usage with falser posts, and that these signals can help to detect misinformation.

Fact-Checking, Fake News, Propaganda, Media Bias, and the COVID-19 Infodemic

This work offers an overview of the emerging and inter-connected research areas of fact-checking, disinformation, "fake news", propaganda, and media bias detection, and the ongoing COVID-19 infodemic.

MisInfoWars: A linguistic analysis of deceptive and credible news

This thesis will confirm that there exist sufficient textual differences between the articles of fake news and credible news to consider them distinct varieties and advocate for differentiation between disingenuous and respectable media based on linguistic variation.

Predicting Factuality of Reporting and Bias of News Media Sources

This work is interested in characterizing entire news media, an under-studied, but arguably important research problem, both in its own right and as a prior for fact-checking systems.

Identifying Fake News on Social Networks Based on Natural Language Processing: Trends and Challenges

Methods for preprocessing natural language data, vectorization, dimensionality reduction, machine learning, and quality assessment of information retrieval are surveyed and placed in the context of fake news identification.

Curtailing Fake News Propagation with Psychographics

It is argued that curtailing fake news is better pursued by identifying its propagators than by classifying its content, and a system for curtailing the propagation of fake news on social media is developed by identifying users who are susceptible to believing and propagating it.



Deception detection for news: Three types of fakes

Three types of fake news are discussed, each in contrast to genuine serious reporting, and their pros and cons as a corpus for text analytics and predictive modeling are weighed.

The Effect of Fact-Checking on Elites: A Field Experiment on U.S. State Legislators

Does external monitoring improve democratic performance? Fact-checking has come to play an increasingly important role in political coverage in the United States, but some research suggests it may be

Lying Words: Predicting Deception from Linguistic Styles

The current project investigated the features of linguistic style that distinguish between true and false stories, and found that liars showed lower cognitive complexity, used fewer self-references and other-references, and used more negative emotion words than truth-tellers.

Belief Echoes: The Persistent Effects of Corrected Misinformation

Across three separate experiments, I find that exposure to negative political information continues to shape attitudes even after the information has been effectively discredited. I call these

Finding Deceptive Opinion Spam by Any Stretch of the Imagination

This work develops and compares three approaches to detecting deceptive opinion spam, and develops a classifier that is nearly 90% accurate on the authors' gold-standard opinion spam dataset, and reveals a relationship between deceptive opinions and imaginative writing.

Linguistic Models for Analyzing and Detecting Biased Language

The analysis of real instances of human edits designed to remove bias from Wikipedia articles uncovers two classes of bias: framing bias, such as praising or perspective-specific words, which is linked to the literature on subjectivity; and epistemological bias, related to whether propositions that are presupposed or entailed in the text are uncontroversially accepted as true.

Hedge Detection as a Lens on Framing in the GMO Debates: A Position Paper

A detailed approach to studying whether hedge detection can be used to understand scientific framing in the GMO debates is presented, and preliminary analyses suggest that hedges occur less frequently in scientific discourse than in popular text.

Automating Linguistics-Based Cues for Detecting Deception in Text-Based Asynchronous Computer-Mediated Communications

The detection of deception is a promising but challenging task. A systematic discussion of automated Linguistics Based Cues (LBC) to deception has rarely been touched before. The experiment studied

“Liar, Liar Pants on Fire”: A New Benchmark Dataset for Fake News Detection

This paper presents LIAR: a new, publicly available dataset for fake news detection, and designs a novel, hybrid convolutional neural network to integrate meta-data with text to improve a text-only deep learning model.

Fact Checking: Task definition and dataset construction

The task of fact checking is introduced and the construction of a publicly available dataset using statements fact-checked by journalists available online is detailed, including baseline approaches for the task and the challenges that need to be addressed.