# Comparing Covariate Prioritization via Matching to Machine Learning Methods for Causal Inference Using Five Empirical Applications

@article{Keele2018ComparingCP, title={Comparing Covariate Prioritization via Matching to Machine Learning Methods for Causal Inference Using Five Empirical Applications}, author={Luke J. Keele and Dylan S. Small}, journal={The American Statistician}, year={2018}, volume={75}, pages={355 - 363} }

Abstract: When investigators seek to estimate causal effects, they often assume that selection into treatment is based only on observed covariates. Under this identification strategy, analysts must adjust for observed confounders. While basic regression models have long been the dominant method of statistical adjustment, methods based on matching or weighting have become more common. Of late, methods based on machine learning (ML) have been developed for statistical adjustment. These ML methods…

## 12 Citations

### Confounder selection strategies targeting stable treatment effect estimators

- Economics, Mathematics
- Statistics in medicine
- 2020

The ability of the proposed confounder selection strategy to correctly select confounders, and to ensure valid inference of the treatment effect following data-driven covariate selection, is assessed empirically and compared with existing methods using simulation studies.

### Randomization Tests to Assess Covariate Balance When Designing and Analyzing Matched Datasets

- Computer Science
- Observational Studies
- 2021

Through simulation and a real application in political science, this work finds that matched datasets with high levels of covariate balance tend to approximate balance-constrained designs like rerandomization, and analyzing them as such can lead to precise causal analyses.

### All models are wrong, but which are useful? Comparing parametric and nonparametric estimation of causal effects in finite samples

- Mathematics, Computer Science
- 2022

A novel approach evaluating performance across thousands of data-generating mechanisms drawn from non-parametric models with semi-informative priors is proposed, and it is found that the nonparametric estimator nearly always outperforms the parametric estimators, with the exception of having similar performance in terms of bias and slightly worse performance under the smallest sample sizes.

### High Resolution Treatment Effects Estimation: Uncovering Effect Heterogeneities with the Modified Causal Forest

- Economics
- Entropy
- 2022

There is great demand for inferring causal effect heterogeneity and for open-source statistical software, which is readily available for practitioners. The mcf package is an open-source Python…

### Mapping of machine learning approaches for description, prediction, and causal inference in the social and health sciences

- Computer Science
- Science advances
- 2022

This paper provides a comprehensive, systematic meta-mapping of research questions in the social and health sciences to appropriate ML approaches by incorporating the necessary requirements of statistical analysis in these disciplines.

### Comparing the Performance of Statistical Adjustment Methods by Recovering the Experimental Benchmark from the REFLUX Trial

- Economics
- Medical decision making: an international journal of the Society for Medical Decision Making
- 2021

It is found that simple propensity score matching methods provide the least accurate estimates versus the RCT benchmark, and future studies should use multiple methods of estimation to fully represent uncertainty according to the choice of estimation approach.

### Innovations in Randomization Inference for the Design and Analysis of Experiments and Observational Studies

- Mathematics
- 2019

This dissertation proposes how to implement rerandomization in factorial experiments, extends the theoretical properties of rerandomization from single-factor experiments to 2^K factorial designs, and demonstrates how a designed experiment can improve precision of estimated factorial effects.

### A Survey of Causal Inference Frameworks

- Computer Science
- 2022

This survey aims to provide a review of the past work on causal inference, focusing mainly on potential outcomes framework and causal graphical models, to help accelerate the understanding of causal inference in different domains.

### Comment: Will Competition-Winning Methods for Causal Inference Also Succeed in Practice?

- Psychology
- Statistical Science
- 2019

First, we would like to congratulate the authors for successfully hosting the causal inference data competition (referred to as Competition henceforth) and contributing a unique and…

### Spatial and Spatiotemporal Matching Framework for Causal Inference (Short Paper)

- Environmental Science
- COSIT
- 2022

Matching is a procedure aimed at reducing the impact of observational data bias in causal analysis. Designing matching methods for spatial data reflecting static spatial or dynamic spatio-temporal…

## References

### Optimizing matching and analysis combinations for estimating causal effects

- Economics, Business
- Scientific reports
- 2016

Simulation results indicate that combining full matching with double robust analysis performed best in both the simulations and the applied example, particularly when combined with machine learning estimation methods.

### Bayesian Nonparametric Modeling for Causal Inference

- Economics
- 2011

Researchers have long struggled to identify causal effects in nonexperimental settings. Many recently proposed strategies assume ignorability of the treatment assignment mechanism and require fitting…

### Ensemble learning of inverse probability weights for marginal structural modeling in large observational datasets

- Environmental Science
- Statistics in medicine
- 2015

This paper describes the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V-fold cross validation, and an ensemble learner that creates a single partition of the data into training and validation sets.

### Estimation and Inference of Heterogeneous Treatment Effects using Random Forests

- Mathematics, Computer Science
- Journal of the American Statistical Association
- 2018

This is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference and is found to be substantially more powerful than classical methods based on nearest-neighbor matching.

### Kernel Balancing: A Flexible Non-Parametric Weighting Procedure for Estimating Causal Effects

- Mathematics
- 2016

Methods such as matching and weighting for causal effect estimation attempt to adjust the joint distribution of observed covariates for treated and control units to be the same. However, they often…

### Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference

- Computer Science
- Political Analysis
- 2007

A unified approach is proposed that makes it possible for researchers to preprocess data with matching and then to apply the best parametric techniques they would have used anyway; this procedure makes parametric models produce more accurate and considerably less model-dependent causal inferences.

### Semiparametric causal inference in matched cohort studies

- Mathematics, Economics
- 2015

Odds ratios can be estimated in case-control studies using standard logistic regression, ignoring the outcome-dependent sampling. In this paper we discuss an analogous result for treatment effects on…

### Double/Debiased Machine Learning for Treatment and Causal Parameters

- Computer Science
- 2017

This work can form an orthogonal score for the target low-dimensional parameter by combining auxiliary and main ML predictions, and build a de-biased estimator of the target parameter which typically will converge at the fastest possible 1/√n rate and be approximately unbiased and normal, and from which valid confidence intervals for these parameters of interest may be constructed.

### Automated versus Do-It-Yourself Methods for Causal Inference: Lessons Learned from a Data Analysis Competition

- Computer Science
- Statistical Science
- 2019

The causal inference data analysis challenge, "Is Your SATT Where It's At?", launched as part of the 2016 Atlantic Causal Inference Conference, sought to make progress with respect to both the data testing grounds and the researchers submitting methods whose efficacy would be evaluated.

### Bayesian Regression Tree Models for Causal Inference: Regularization, Confounding, and Heterogeneous Effects

- Economics
- 2017

The Bayesian causal forest model permits treatment effect heterogeneity to be regularized separately from the prognostic effect of control variables, making it possible to informatively "shrink to homogeneity".