Corpus ID: 235436003

CausalNLP: A Practical Toolkit for Causal Inference with Text

Arun S. Maiya
Causal inference is the process of estimating the effect or impact of a treatment on an outcome, with other covariates treated as potential confounders (or mediators or suppressors) that may need to be controlled or balanced. The vast majority of existing methods and systems for causal inference assume that all variables under consideration are categorical or numerical (e.g., gender, price, blood pressure, enrollment). In this paper, we present CausalNLP, a toolkit for inferring causality…
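
To make the role of a confounder concrete, here is a minimal sketch on simulated data. It does not use CausalNLP or reproduce the paper's method: the function names (`simulate`, `naive_effect`, `adjusted_effect`), the data-generating process, and the true effect of 2.0 are all assumptions chosen for illustration, and plain linear regression stands in for whatever estimator a real analysis would use. The sketch compares a naive difference in means with an estimate that controls for a single numerical confounder.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate a treatment, one confounder, and an outcome (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    confounder = rng.normal(size=n)
    # Units with a high confounder value are more likely to be treated.
    p_treated = 1.0 / (1.0 + np.exp(-2.0 * confounder))
    treatment = (rng.random(n) < p_treated).astype(float)
    # Assumed true treatment effect is 2.0; the confounder also drives the outcome.
    outcome = 2.0 * treatment + 3.0 * confounder + rng.normal(size=n)
    return treatment, confounder, outcome


def naive_effect(treatment, outcome):
    """Difference in mean outcome between treated and untreated units."""
    return outcome[treatment == 1].mean() - outcome[treatment == 0].mean()


def adjusted_effect(treatment, confounder, outcome):
    """Treatment coefficient from least squares that also includes the confounder."""
    design = np.column_stack([np.ones_like(treatment), treatment, confounder])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1]


if __name__ == "__main__":
    t, x, y = simulate()
    print("naive estimate:   ", naive_effect(t, y))
    print("adjusted estimate:", adjusted_effect(t, x, y))
```

In this simulation the naive estimate absorbs part of the confounder's influence and overstates the effect, while the adjusted estimate lands close to the assumed value. When the confounder is raw text rather than a number, it cannot be placed in a regression directly, which is the gap that text-aware tooling is meant to address.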


Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates
This review is the first to gather and categorize examples of potential confounders from observed text and provide a guide to data-processing and evaluation decisions.
Challenges of Using Text Classifiers for Causal Inference
This work demonstrates how to conduct causal analyses using text classifiers on simulated and Yelp data, and discusses the opportunities and challenges of future work that uses text data in causal inference.
Causal Effects of Linguistic Properties
TextCause, an algorithm for estimating causal effects of linguistic properties, is introduced, and it is shown that the proposed method outperforms related approaches when estimating the effect of Amazon review sentiment on semi-simulated sales figures.
Adjusting for Confounding with Text Matching
Topical inverse regression matching, a text-matching method that lets the analyst match on both the topical content of confounding documents and the probability that each of these documents is treated, is proposed.
Metalearners for estimating heterogeneous treatment effects using machine learning
A metalearner, the X-learner, is proposed, which can adapt to structural properties, such as the smoothness and sparsity of the underlying treatment effect, and is shown to be easy to use and to produce results that are interpretable.
Discovery of Treatments from Text Corpora
A new experimental design and statistical model are introduced to simultaneously discover treatments in a corpus and estimate causal effects for these discovered treatments, including the effects of these interventions in a test set of new texts and survey respondents.
Estimating Causal Effects of Tone in Online Debates
The causal effect of reply tones in debates on linguistic and sentiment changes in subsequent responses is estimated. The results suggest that factual and asserting tones affect dialogue, and the work provides a methodology for estimating causal effects from text.
CausalML: Python Package for Causal Machine Learning
The key concepts, scope, and use cases of the CausalML package are introduced. The package tries to bridge the gap between theoretical work on methodology and practical applications by making a collection of methods in this field available in Python.
Equivalence of the Mediation, Confounding and Suppression Effect
The statistical similarities among mediation, confounding, and suppression are described, and methods to determine the confidence intervals for confounding and suppression effects are proposed based on methods developed for mediated effects.
Matching with Text Data: An Experimental Evaluation of Methods for Matching Documents and of Measuring Match Quality
A framework for matching text documents is characterized that decomposes existing methods into (1) the choice of text representation and (2) the choice of distance metric, and a predictive model is developed to estimate the match quality of pairs of text documents as a function of the authors' various distance scores.