# CausalNLP: A Practical Toolkit for Causal Inference with Text

@article{Maiya2021CausalNLPAP, title={CausalNLP: A Practical Toolkit for Causal Inference with Text}, author={Arun S. Maiya}, journal={ArXiv}, year={2021}, volume={abs/2106.08043} }

Causal inference is the process of estimating the effect of a treatment on an outcome, with other covariates treated as potential confounders (or mediators or suppressors) that may need to be controlled or balanced. The vast majority of existing methods and systems for causal inference assume that all variables under consideration are categorical or numerical (e.g., gender, price, blood pressure, enrollment). In this paper, we present CausalNLP, a toolkit for inferring causality…
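To make the role of a confounder concrete, here is a minimal sketch in plain Python. It is not CausalNLP's API and does not use text: the records, the binary confounder `z`, and the helper names are all hypothetical, chosen only to illustrate why a covariate may need to be controlled. Outcomes are constructed so that the true treatment effect is exactly 2, while treated units are concentrated in the `z = 1` stratum.

```python
from collections import defaultdict

# Hypothetical toy records: (confounder z, treatment t, outcome y).
# Outcomes follow y = 2*t + 5*z exactly, so the true treatment effect is 2.
def make_records():
    records = []
    for z, n_treated, n_control in [(0, 2, 6), (1, 6, 2)]:
        records += [(z, 1, 2 + 5 * z)] * n_treated
        records += [(z, 0, 5 * z)] * n_control
    return records

def mean(values):
    return sum(values) / len(values)

def naive_effect(records):
    # Difference in mean outcomes, ignoring the confounder entirely.
    treated = [y for _, t, y in records if t == 1]
    control = [y for _, t, y in records if t == 0]
    return mean(treated) - mean(control)

def stratified_effect(records):
    # Difference in means within each level of z, weighted by stratum size.
    strata = defaultdict(list)
    for z, t, y in records:
        strata[z].append((t, y))
    total = len(records)
    effect = 0.0
    for rows in strata.values():
        treated = [y for t, y in rows if t == 1]
        control = [y for t, y in rows if t == 0]
        effect += (len(rows) / total) * (mean(treated) - mean(control))
    return effect

records = make_records()
naive = naive_effect(records)
adjusted = stratified_effect(records)
print(naive, adjusted)
```

On this constructed data the naive comparison overstates the effect because `z` raises the outcome and is also more common among treated units. Stratifying on `z` recovers the effect built into the data. A toolkit that handles text has to solve the same problem when the confounding information is carried by a text field rather than a ready-made numeric column.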

#### References

Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates

- Computer Science
- ACL
- 2020

This review is the first to gather and categorize examples of potential confounders from observed text and provide a guide to data-processing and evaluation decisions.

Challenges of Using Text Classifiers for Causal Inference

- Computer Science, Medicine
- EMNLP
- 2018

The paper demonstrates how to conduct causal analyses using text classifiers on simulated and Yelp data, and discusses the opportunities and challenges of future work that uses text data in causal inference.

Causal Effects of Linguistic Properties

- Computer Science
- NAACL
- 2021

The paper introduces TextCause, an algorithm for estimating causal effects of linguistic properties, and shows that it outperforms related approaches when estimating the effect of Amazon review sentiment on semi-simulated sales figures.

Adjusting for Confounding with Text Matching

- Computer Science
- 2020

The paper proposes topical inverse regression matching, a text-matching method that lets the analyst match both on the topical content of confounding documents and on the probability that each of these documents is treated.

Metalearners for estimating heterogeneous treatment effects using machine learning

- Mathematics, Medicine
- Proceedings of the National Academy of Sciences
- 2019

A metalearner, the X-learner, is proposed, which can adapt to structural properties, such as the smoothness and sparsity of the underlying treatment effect, and is shown to be easy to use and to produce results that are interpretable.

Discovery of Treatments from Text Corpora

- Computer Science
- ACL
- 2016

A new experimental design and statistical model are introduced to simultaneously discover treatments in a corpus and estimate causal effects for these discovered treatments, and to estimate the effects of these interventions in a test set of new texts and survey respondents.

Estimating Causal Effects of Tone in Online Debates

- Computer Science, Psychology
- IJCAI
- 2019

The paper estimates the causal effect of reply tones in debates on linguistic and sentiment changes in subsequent responses, suggests that factual and asserting tones affect dialogue, and provides a methodology for estimating causal effects from text.

CausalML: Python Package for Causal Machine Learning

- Computer Science, Mathematics
- ArXiv
- 2020

The paper introduces the key concepts, scope, and use cases of the CausalML package, which tries to bridge the gap between theoretical work on methodology and practical applications by making a collection of methods in this field available in Python.

Equivalence of the Mediation, Confounding and Suppression Effect

- Mathematics, Medicine
- Prevention Science
- 2004

The statistical similarities among mediation, confounding, and suppression are described and methods to determine the confidence intervals for confounding and suppression effects are proposed based on methods developed for mediated effects.

Matching with Text Data: An Experimental Evaluation of Methods for Matching Documents and of Measuring Match Quality

- Computer Science, Mathematics
- Political Analysis
- 2020

A framework for matching text documents is characterized that decomposes existing methods into (1) the choice of text representation and (2) the choice of distance metric, and a predictive model is developed to estimate the match quality of pairs of text documents as a function of the authors' various distance scores.