Corpus ID: 18989877

Methodological Challenges in Estimating Tone: Application to News Coverage of the U.S. Economy

Pablo Barberá, Amber E. Boydstun, Suzanna Linn, Ryan McMahon, Jonathan Nagler
Machine learning methods have made possible the classification of large corpora of text by measures such as topic, tone, and ideology. However, even when using dictionary-based methods that require few inputs by the analyst beyond the text itself, many decisions must be made before a measure of any kind is produced from the text. When coding media the analyst must decide on the universe of media sources to sample from, as well as the criteria for selecting articles for coding from within that… 

Studying Political Decision Making With Automatic Text Analysis

The confluence of increasing availability of large digital text collections, plentiful computational power, and methodological innovations has led many researchers to adopt techniques of automatic text analysis for coding and analyzing textual data.

Four best practices for measuring news sentiment using ‘off-the-shelf’ dictionaries: a large-scale p-hacking experiment

Four best practices are summarized: use a suitable sentiment dictionary; do not assume that the validity and reliability of the dictionary are ‘built-in’; check for the influence of content length; and do not use multiple dictionaries to test the same statistical hypothesis.

Perspective Identification in Informal Text

A taxonomy of the most common community perspectives among Egyptians is developed, along with an iterative feedback-loop process for devising guidelines on how to annotate a given online discussion forum post with different elements of a person's perspective.

A Cross-National Analysis of the Causes and Consequences of Economic News

Objective. Work on economic news argues that U.S. coverage focuses primarily on changes rather than levels of future economic conditions; it also both affects and reflects public economic sentiment.

A Bad Workman Blames His Tweets: The Consequences of Citizens’ Uncivil Twitter Use when Interacting with Party Candidates

The recent emergence of microblogs has had a significant effect on the contemporary political landscape. The platform’s potential to enhance information availability and make interactive discussions

Extracting semantic relations using syntax

The rsyntax R package is introduced, which is designed to make working with dependency trees easier and more intuitive for R users, and provides a framework for combining multiple rules for reliably extracting useful semantic relations.

Toward an Aggregate, Implicit, and Dynamic Model of Norm Formation: Capturing Large-Scale Media Representations of Dynamic Descriptive Norms Through Automated and Crowdsourced Content Analysis.

This study describes tobacco and e-cigarette norm prevalence and trends over 37 months through an examination of a census of 135,764 long-form media texts, 12,262 popular YouTube videos, and 75,322,911 tweets.

Does newspaper coverage influence or reflect public perceptions of the economy?

Citizens’ economic perceptions can shape their political and economic behavior, making the origins of those perceptions an important question. Research commonly posits that media coverage is a

Do Longitudinal Trends in Tobacco 21-Related Media Coverage Correlate with Policy Support? an Exploratory Analysis Using Supervised and Unsupervised Machine Learning Methods

An exploratory content analysis to identify texts about Tobacco 21 in a large corpus of tobacco texts published in four popular media sources found that the prevalence of Tobacco 21 media coverage and Tobacco 21 support among young smokers exhibited similar temporal patterns for much of the study period.

News-driven business cycles: a narrative approach

This paper analyses the effects of technology news on the US business cycle. The paper suggests a new frequency-based index about the technology news from a major news outlet for the period 1948Q1 to



Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts

Politics and political conflict often occur in the written and spoken word. Scholars have long recognized this, but the massive costs of analyzing even moderately sized collections of texts have

Affective News: The Automated Coding of Sentiment in Political Texts

The objective here is to outline and validate a new automated measurement instrument for sentiment analysis in political texts using a dictionary-based approach consisting of a simple word count of the frequency of keywords in a text from a predefined dictionary.

Fightin' Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict

A variety of techniques for selecting words that capture partisan, or other, differences in political speech and for evaluating the relative importance of those words are discussed and several new approaches based on Bayesian shrinkage and regularization are introduced.

Computer‐Assisted Keyword and Document Set Discovery from Unstructured Text

A computer-assisted (as opposed to fully automated or human-only) statistical approach that suggests keywords from available text without needing structured data as inputs is developed, which leads to a widely applicable algorithm.

When is a Liability not a Liability? Textual Analysis, Dictionaries, and 10-Ks

Previous research uses negative word counts to measure the tone of a text. We show that word lists developed for other disciplines misclassify common words in financial text. In a large sample of 10

Determining Economic News Coverage

While much research has been devoted to how individuals respond to media messages and frames, we know much less about what motivates variations in the content and tone of media coverage. Given the

RTextTools: A Supervised Learning Package for Text Classification

RTextTools was designed to make machine learning accessible by providing a start-to-finish product in less than 10 steps and can be used to train or cross-validate data.

Computer-Assisted Topic Classification for Mixed-Methods Social Science Research

This system maintains high classification accuracy and provides accurate estimates of document proportions, while achieving reliability levels associated with human efforts, and it is estimated that it lowers the costs of classifying large numbers of complex documents by 80% or more.

Consumer Sentiment, the Economy, and the News Media

The news media affects consumers' perceptions of the economy through three channels. First, the news media conveys the latest economic data and the opinions of professionals to consumers. Second,

The Toyota Recall Crisis: Media Impact on Toyota's Corporate Brand Reputation

The time trend of public opinion about carmaker Toyota dropped precipitously in early 2010 following a series of quality issues and recalls. The mathematical model of ideodynamics could predict the