Classifying Party Affiliation from Political Speech

Bei Yu, Stefan Kaufmann, Daniel Diermeier. Journal of Information Technology & Politics, pp. 33-48.
ABSTRACT In this article, we discuss the design of party classifiers for Congressional speech data. We then examine these party classifiers' person-dependency and time-dependency. We found that party classifiers trained on 2005 House speeches can be generalized to the Senate speeches of the same year, but not vice versa. The classifiers trained on 2005 House speeches performed better on Senate speeches from recent years than on older ones, which indicates the classifiers' time-dependency. This… 
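
The cross-chamber evaluation described in the abstract (train on speeches from one chamber, test on the other) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not say which learning algorithm was used, so a small multinomial naive Bayes classifier with add-one smoothing stands in here, and the `house` and `senate` speeches, the labels "R" and "D", and all function names are invented for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase, whitespace-split bag of words.
    return text.lower().split()


def train_nb(docs):
    """Fit a multinomial naive Bayes model on a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of speeches
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    n_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        total = sum(word_counts[label].values())
        score = math.log(label_counts[label] / n_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words unseen in training are skipped
                score += math.log(
                    (word_counts[label][tok] + 1) / (total + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs):
    return sum(predict(model, text) == label for text, label in docs) / len(docs)


# Invented toy speeches standing in for the two chambers.
house = [
    ("cut taxes and reduce spending", "R"),
    ("tax relief for small business", "R"),
    ("protect workers and expand health care", "D"),
    ("invest in public schools and health care", "D"),
]
senate = [
    ("reduce taxes for business", "R"),
    ("expand health care for workers", "D"),
]

# Train on one chamber, evaluate on the other.
house_model = train_nb(house)
cross_chamber_acc = accuracy(house_model, senate)
print(cross_chamber_acc)
```

Swapping the roles of `house` and `senate` gives the reverse direction, and replacing the test set with speeches from other years would give the analogous time-dependency check.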

Text to Ideology or Text to Party Status

A number of recent papers have used support-vector machines with word features to classify political texts — in particular, legislative speech — by ideology. Our own work on this topic led us to

Language and Ideology in Congress

Legislative speech records from the 101st to 108th Congresses of the US Senate are analysed to study political ideologies. A widely-used text classification algorithm – Support Vector Machines (SVM)

Computational Identification of Ideology in Text: A Study of Canadian Parliamentary Debates

In this study, we explore the task of classifying members of the 36th Canadian Parliament by ideology, which we approximate using party membership. Earlier work has been done on data from the U.S.

“A Nation Divided”: Classifying Presidential Speeches

Given the polarization of political parties, the variance of political views across states and the changing political climate over time in the United States, we ask how presidential candidates’

A Longitudinal Study of Language and Ideology in Congress

An analysis of the legislative speech records from the 101st-108th U.S. Congresses using machine learning and natural language processing methods provides evidence for a long-term increase in partisanship in both chambers with the House consistently more ideologically divided than the Senate.

Predicting political party affiliation from text

Every day a large amount of text is produced during public discourse. Some of this text is produced by actors whose political colour is very obvious. However, though many actors cannot clearly be

What’s in a Word? Detecting Partisan Affiliation from Word Use in Congressional Speeches

Very comparable results can be obtained using a much simpler linear classifier in word space, indicating that the use of words in partisan ways is not particularly complicated and that it has become steadily easier to infer partisan affiliation from political speeches in the United States.

A natural language measure of ideology in the Brazilian Senate

Abstract: We estimate a measure of political ideology using as data a corpus of over two decades of speeches delivered by Brazilian Federal Senators across five legislatures. We employ a

Analysis of speech transcripts to predict winners of U.S. Presidential and Vice-Presidential debates

Investigations into the speech used in American Presidential and Vice-Presidential debates are described and a set of surface-level features from historical debates are found to predict the winners of presidential debates with success moderately above chance.

Predicting Party Group from the Lithuanian Parliamentary Speeches

This research experimentally investigated the influence of different pre-processing techniques and feature types on two datasets composed of the texts taken from two parliamentary terms, finding a classifier based on the bag-of-words and token bigrams interpolation gave the best results.



Get out the vote: Determining support or opposition from Congressional floor-debate transcripts

It is found that the incorporation of sources of information regarding relationships between discourse segments, such as whether a given utterance indicates agreement with the opinion expressed by another, yields substantial improvements over classifying speeches in isolation.

Identifying and classifying subjective claims

This work identifies the main claim or assertion of the writer and classifies it into predefined classes of opinion (attitude) toward the topic.

Extracting Policy Positions from Political Texts Using Words as Data

We present a new way of extracting policy positions from political texts that treats texts not as discourses to be understood and interpreted but rather, as data in the form of words. We compare this

An Automated Method of Topic-Coding Legislative Speech Over Time with Application to the 105th-108th U.S. Senate

A method for statistical learning from speech documents that is applied to the Congressional Record in order to gain new insight into the dynamics of the political agenda and can reveal speech topics that are both distinctive and inter-related in substantively meaningful ways is described.

Exploring the characteristics of opinion expressions for political opinion classification

Results suggest that recognizing the sentiment is not enough for political opinion classification, and what seems to be needed is a more fine-grained model of individuals' ideological positions and the different ways in which those positions manifest themselves in political discourse.

Congress: A Political-Economic History of Roll Call Voting

In this wide-ranging study, the authors use 200 years of congressional roll call voting as a framework for an interpretation of important episodes in American political and economic history. By

Thumbs up? Sentiment Classification using Machine Learning Techniques

This work considers the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, and concludes by examining factors that make the sentiment classification problem more challenging.

Recounting the Courts? Toward A Text-Centered Computational Approach to Understanding the Dynamics of the Judicial System

This paper explores the potential uses of computational linguistics techniques for analyzing Supreme Court briefs and opinions. To do so, we focused on advocacy documents associated with the two

A comparison of event models for naive bayes text classification

It is found that the multi-variate Bernoulli performs well with small vocabulary sizes, but that the multinomial usually performs even better at larger vocabulary sizes, providing on average a 27% reduction in error over the multi-variate Bernoulli model at any vocabulary size.

Machine learning in automated text categorization

This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.