Corpus ID: 227238739

RealCause: Realistic Causal Inference Benchmarking

@article{Neal2020RealCauseRC,
  title={RealCause: Realistic Causal Inference Benchmarking},
  author={Brady Neal and Chin-Wei Huang and Sunand Raghupathi},
  journal={ArXiv},
  year={2020},
  volume={abs/2011.15007}
}
There are many different causal effect estimators in causal inference. However, it is unclear how to choose between these estimators because there is no ground-truth for causal effects. A commonly used option is to simulate synthetic data, where the ground-truth is known. However, the best causal estimators on synthetic data are unlikely to be the best causal estimators on realistic data. An ideal benchmark for causal estimators would both (a) yield ground-truth values of the causal effects and… 
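
The abstract's proposal is easiest to see in miniature: fit a generative model to real observational data, sample synthetic datasets whose potential outcomes, and hence causal effects, are known by construction, and score estimators against that ground truth. The sketch below illustrates this simulate-then-evaluate loop with a toy linear data-generating process standing in for a learned generative model; every name in it is illustrative, not the RealCause implementation. RealCause's specific contribution is to make the sampled data realistic by fitting flexible generative models to real datasets.

```python
# Minimal sketch of the simulate-then-evaluate idea from the abstract, with a
# toy linear model in place of a learned generative model. All names here are
# illustrative assumptions, not the RealCause codebase.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "real" observational data: covariate x, treatment t, outcome y.
n = 5000
x = rng.normal(size=n)
t = rng.binomial(1, 1 / (1 + np.exp(-x)))                # confounded treatment
y = t * (1 + 0.5 * x) + x + rng.normal(scale=0.1, size=n)

def sample_synthetic(x):
    """Sample (t, y) from the 'fitted' model; here we pretend the model
    recovered the true DGP, so both potential outcomes are known."""
    t_syn = rng.binomial(1, 1 / (1 + np.exp(-x)))
    y0 = x                                               # outcome under t = 0
    y1 = 1 + 1.5 * x                                     # outcome under t = 1
    y_syn = np.where(t_syn == 1, y1, y0) + rng.normal(scale=0.1, size=len(x))
    return t_syn, y_syn, float((y1 - y0).mean())         # ground-truth ATE

t_syn, y_syn, true_ate = sample_synthetic(x)

# Any candidate estimator can now be scored against the known ground truth.
naive = y_syn[t_syn == 1].mean() - y_syn[t_syn == 0].mean()
print(f"true ATE {true_ate:.3f}, naive estimate {naive:.3f}, "
      f"error {abs(naive - true_ate):.3f}")
```

Citations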
Generating Synthetic Text Data to Evaluate Causal Inference Methods
TLDR
This work develops a framework for adapting existing generation models to produce synthetic text datasets with known causal effects and uses this framework to perform an empirical comparison of four recently-proposed methods for estimating causal effects from text data.
DoWhy: Addressing Challenges in Expressing and Validating Causal Assumptions
TLDR
DoWhy is a framework that allows explicit declaration of assumptions through a causal graph and provides multiple validation tests to check a subset of these assumptions; the work also highlights a number of open questions for future research.
ADCB: An Alzheimer's disease simulator for benchmarking observational estimators of causal effects
TLDR
A simulator of clinical variables associated with Alzheimer’s disease is developed to serve as a benchmark for causal effect estimation while modeling intricacies of healthcare data.
Synthetic Negative Controls: Using Simulation to Screen Large-scale Propensity Score Analyses
The propensity score has become a standard tool to control for large numbers of variables in healthcare database studies. However, little has been written on the challenge of comparing large-scale…
Undersmoothing Causal Estimators with Generative Trees
Inferring individualised treatment effects from observational data can unlock the potential for targeted interventions. It is, however, hard to infer these effects from observational data. One major…
Really Doing Great at Estimating CATE? A Critical Look at ML Benchmarking Practices in Treatment Effect Estimation
TLDR
This paper investigates current benchmarking practices for ML-based conditional average treatment effect (CATE) estimators, with special focus on empirical evaluation based on the popular semi-synthetic IHDP benchmark.
ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects
TLDR
A simulator of Alzheimer's disease aimed at modeling intricacies of healthcare data while enabling benchmarking of causal effect and policy estimators is proposed and used to compare estimators of average and conditional treatment effects.
A Causal Approach to Prescriptive Process Monitoring
TLDR
The need for effective recommendations in process mining is addressed by devising new prescriptive process monitoring methods based on causal relationships; these methods make recommendations, at tactical and operational levels, about what actions should be taken to achieve a given process objective.
A pragmatic approach to estimating average treatment effects from EHR data: the effect of prone positioning on mechanically ventilated COVID-19 patients
TLDR
A pragmatic methodology is proposed for obtaining preliminary but robust estimates of treatment effects from observational studies, providing front-line clinicians with a degree of confidence in their treatment strategy.
Doing Great at Estimating CATE? On the Neglected Assumptions in Benchmark Comparisons of Treatment Effect Estimators
TLDR
This paper considers two popular machine learning benchmark datasets for evaluating heterogeneous treatment effect estimators, the IHDP and ACIC2016 datasets, in detail; it identifies problems with their current use and highlights that the inherent characteristics of the benchmark datasets favor some algorithms over others.
…

References

Showing 1–10 of 56 references
Synth-Validation: Selecting the Best Causal Inference Method for a Given Dataset
TLDR
This work proposes synth-validation, a procedure that estimates the estimation error of causal inference methods applied to a given dataset: it fits synthetic data-generating distributions with known treatment effects to the observed data, applies each causal inference method to datasets sampled from these distributions, and compares the effect estimates with the known effects to estimate error.
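
A minimal sketch of that loop, assuming toy linear DGPs indexed by candidate effect values and two illustrative estimators; none of these names come from the paper.

```python
# Sketch of a synth-validation-style loop: fit DGPs with known effects,
# sample datasets, and score each candidate estimator against the truth.
import numpy as np

rng = np.random.default_rng(1)

def make_dgp(true_effect):
    """Return a sampler whose average treatment effect is known by construction."""
    def sample(n=2000):
        x = rng.normal(size=n)
        t = rng.binomial(1, 1 / (1 + np.exp(-x)))        # confounded assignment
        y = true_effect * t + x + rng.normal(scale=0.5, size=n)
        return x, t, y
    return sample

def naive_diff(x, t, y):
    """Unadjusted difference in means (biased under confounding)."""
    return y[t == 1].mean() - y[t == 0].mean()

def regression_adjust(x, t, y):
    """OLS of y on [1, t, x]; the coefficient on t estimates the effect."""
    X = np.column_stack([np.ones_like(x), t, x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

# Candidate effect values define the synthetic data-generating distributions.
effects = (0.0, 0.5, 1.0)
estimators = {"naive": naive_diff, "adjusted": regression_adjust}

for name, estimator in estimators.items():
    errors = []
    for true_effect in effects:
        sample = make_dgp(true_effect)
        for _ in range(20):                              # replications per DGP
            errors.append(abs(estimator(*sample()) - true_effect))
    print(f"{name}: mean absolute error {np.mean(errors):.3f}")
```

Synth-validation would then select, for the real dataset, the method with the lowest estimated error across the sampled datasets.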
Benchmarking Framework for Performance-Evaluation of Causal Inference Analysis
TLDR
This work presents a comprehensive framework for benchmarking algorithms that estimate causal effects: the covariates come from real-world data, while the treatment assignments and outcomes are simulated, which provides the ground truth needed for validation.
Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence
TLDR
An Empirical Monte Carlo study that relies on arguably realistic data-generating processes (DGPs) based on actual data is used to investigate the finite-sample performance of causal machine learning estimators for heterogeneous causal effects at different aggregation levels.
Removing Hidden Confounding by Experimental Grounding
TLDR
This work introduces a novel method of using limited experimental data to correct the hidden confounding in causal effect models trained on larger observational data, even if the observational data does not fully overlap with the experimental data.
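
A simplified sketch of the strategy this entry summarizes, under strong simplifying assumptions: linear models stand in for the paper's learners, and a Horvitz–Thompson pseudo-outcome supplies the unconfounded experimental signal. All names and the toy data-generating process are illustrative.

```python
# Sketch: correct a confounded observational effect model with an additive
# term fit on a small randomized experiment. Toy DGP; names are assumptions.
import numpy as np

rng = np.random.default_rng(2)

def fit_linear(xa, ya):
    """Least-squares fit of ya on [1, xa]; returns a prediction function."""
    A = np.column_stack([np.ones_like(xa), xa])
    beta, *_ = np.linalg.lstsq(A, ya, rcond=None)
    return lambda xs: beta[0] + beta[1] * xs

# Large observational sample with a *hidden* confounder u (true CATE is 1 + x).
n_obs = 10000
x = rng.normal(size=n_obs)
u = rng.normal(size=n_obs)                               # unobserved
t = rng.binomial(1, 1 / (1 + np.exp(-(x + u))))
y = t * (1 + x) + x + 2 * u + rng.normal(scale=0.5, size=n_obs)

# Effect model trained on observational data alone: biased by u.
mu1 = fit_linear(x[t == 1], y[t == 1])
mu0 = fit_linear(x[t == 0], y[t == 0])

def tau_obs(xs):
    return mu1(xs) - mu0(xs)

# Small randomized experiment: unconfounded but high-variance signal.
n_exp = 300
xe = rng.normal(size=n_exp)
te = rng.binomial(1, 0.5, size=n_exp)
ye = te * (1 + xe) + xe + 2 * rng.normal(size=n_exp) + rng.normal(scale=0.5, size=n_exp)
pseudo = 2 * ye * (2 * te - 1)          # unbiased for the CATE when P(T=1)=1/2

# Fit an additive correction on the experimental residuals and apply it.
eta = fit_linear(xe, pseudo - tau_obs(xe))

def tau_corrected(xs):
    return tau_obs(xs) + eta(xs)

grid = np.array([-1.0, 0.0, 1.0])
print("true CATE      ", 1 + grid)
print("observational  ", np.round(tau_obs(grid), 2))
print("corrected      ", np.round(tau_corrected(grid), 2))
```

The design intuition is that the experiment only needs to support fitting a low-complexity correction, not the full effect model, which is why a small unconfounded sample can suffice.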
Statistics and Causal Inference
Abstract
Problems involving causal inference have dogged at the heels of statistics since its earliest days. Correlation does not imply causation, and yet causal conclusions drawn from a carefully…
Causal Effect Inference with Deep Latent-Variable Models
TLDR
This work builds on recent advances in latent-variable modeling to simultaneously estimate the unknown latent space summarizing the confounders and the causal effect; the method is shown to be significantly more robust than existing methods and matches the state of the art on previous benchmarks focused on individual treatment effects.
An Evaluation Toolkit to Guide Model Selection and Cohort Definition in Causal Inference
TLDR
This work develops a toolkit that expands established machine learning evaluation methods with several causal-specific ones; the toolkit is agnostic to the machine learning model being evaluated.
Introduction to Causal Inference
  • P. Spirtes
  • Computer Science
    J. Mach. Learn. Res.
  • 2010
TLDR
This introduction to the Special Topic on Causality provides a brief introduction to graphical causal modeling, places the articles in a broader context, and describes the differences between causal inference and ordinary machine learning classification and prediction problems.
Automated versus Do-It-Yourself Methods for Causal Inference: Lessons Learned from a Data Analysis Competition
TLDR
The causal inference data analysis challenge, "Is Your SATT Where It's At?", launched as part of the 2016 Atlantic Causal Inference Conference, sought to make progress with respect to both the data testing grounds and the researchers submitting methods whose efficacy would be evaluated.
Estimating individual treatment effect: generalization bounds and algorithms
TLDR
A novel, simple and intuitive generalization-error bound is given, showing that the expected ITE estimation error of a representation is bounded by the sum of the standard generalization error of that representation and the distance between the treated and control distributions induced by the representation.
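
Schematically, the bound described above has the following shape; the notation and the constant are a paraphrase for illustration, not the paper's exact statement.

```latex
% Schematic paraphrase: the ITE error of a representation \Phi is controlled
% by the factual (supervised) errors in each treatment arm plus an integral
% probability metric between the induced treated and control distributions.
\[
  \epsilon_{\mathrm{ITE}}(\Phi)
  \;\le\;
  C \Big( \epsilon_{F}^{\,t=1}(\Phi) + \epsilon_{F}^{\,t=0}(\Phi)
          + \mathrm{IPM}_{\mathcal{G}}\!\big(p_{\Phi}^{\,t=1},\, p_{\Phi}^{\,t=0}\big) \Big)
\]
```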
…