Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts
Politics and political conflict often occur in the written and spoken word. Scholars have long recognized this, but the massive costs of analyzing even moderately sized collections of texts have
A Bayesian Hierarchical Topic Model for Political Texts: Measuring Expressed Agendas in Senate Press Releases
This paper introduces a statistical model that attends to the structure of political rhetoric when measuring expressed priorities: statements are naturally organized by author. The model uses this structure to simultaneously estimate the topics in the texts and the attention political actors allocate to those topics.
Are Close Elections Random?
Elections with small margins of victory represent an important form of democratic competition and, increasingly, an opportunity for causal inference. When scholars use close elections for examining
Representational Style in Congress: What Legislators Say and Why It Matters
1. Representation inside and outside Congress 2. Representation and evaluation on the senator's terms 3. Measuring presentational styles with Senate press releases 4. Measuring presentational styles
General purpose computer-assisted clustering and conceptualization
This work develops a metric space of partitions from all existing cluster analysis methods applied to a given dataset and demonstrates that this approach facilitates more efficient and insightful discovery of useful information than expert human coders or many existing fully automated methods.
Appropriators not Position Takers: The Distorting Effects of Electoral Incentives on Congressional Representation
Congressional districts create two levels of representation. Studies of representation focus on a disaggregated level: the electoral connection between representatives and constituents. But there is
Estimating Heterogeneous Treatment Effects and the Effects of Heterogeneous Treatments with Ensemble Methods
This work shows how an ensemble of methods (weighted averages of estimates from individual models, increasingly used in machine learning) accurately measures heterogeneous effects, and how pooling models leads to performance superior to that of individual methods across diverse problems.
How Words and Money Cultivate a Personal Vote: The Effect of Legislator Credit Claiming on Constituent Credit Allocation
Particularistic spending, a large literature argues, builds support for incumbents. This literature equates money spent in the district with the credit constituents allocate. Yet, constituents lack
How to Make Causal Inferences Using Texts
This work introduces a conceptual framework for making causal inferences with discovered measures as a treatment or outcome. The framework enables researchers to discover high-dimensional textual interventions and to estimate the ways that observed treatments affect text-based outcomes.