# Estimating individual treatment effect: generalization bounds and algorithms

@inproceedings{Shalit2017EstimatingIT, title={Estimating individual treatment effect: generalization bounds and algorithms}, author={Uri Shalit and Fredrik D. Johansson and David A. Sontag}, booktitle={ICML}, year={2017} }

There is intense interest in applying machine learning to problems of causal inference in fields such as healthcare, economics and education. [...] We give a novel, simple and intuitive generalization-error bound showing that the expected ITE estimation error of a representation is bounded by a sum of the standard generalization-error of that representation and the distance between the treated and control distributions induced by the representation.
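
The two terms in the bound described above suggest a training objective: a prediction loss on observed outcomes plus a penalty on how far apart the treated and control groups lie in representation space. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the weight `alpha`, and the choice of distance (a squared difference of group means, i.e. a linear-kernel MMD) are assumptions made for illustration; the summary above does not specify them.

```python
import numpy as np


def linear_mmd_sq(phi_treated, phi_control):
    """Squared Euclidean distance between the two groups' mean representations.

    Assumption: this linear-kernel MMD stands in for the "distance between
    the treated and control distributions"; other distances could be used.
    """
    diff = phi_treated.mean(axis=0) - phi_control.mean(axis=0)
    return float(diff @ diff)


def balanced_objective(phi, t, y, y_pred, alpha=1.0):
    """Factual squared-error loss plus alpha times the imbalance penalty.

    phi    : (n, d) array of learned representations (hypothetical input)
    t      : (n,) array of 0/1 treatment indicators
    y      : (n,) array of observed (factual) outcomes
    y_pred : (n,) array of predicted factual outcomes
    alpha  : hypothetical trade-off weight between fit and balance
    """
    factual_loss = float(np.mean((y_pred - y) ** 2))
    imbalance = linear_mmd_sq(phi[t == 1], phi[t == 0])
    return factual_loss + alpha * imbalance


# Toy usage: four units, two treated and two control.
phi = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0], [3.0, 1.0]])
t = np.array([0, 0, 1, 1])
y = np.array([1.0, 2.0, 3.0, 4.0])
objective = balanced_objective(phi, t, y, y_pred=y, alpha=0.5)
print(objective)
```

With perfect factual predictions the first term vanishes, so the printed value reflects only the imbalance penalty. A representation that maps both groups to the same mean would drive the second term to zero as well.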

## 410 Citations

Quantifying Error in the Presence of Confounders for Causal Inference

- Computer Science, Mathematics
- ArXiv
- 2019

This work provides a new way to reason about competing estimators, and opens up the potential of deriving new methods by minimizing the proposed error bounds.

Bayesian Nonparametric Causal Inference: Information Rates and Learning Algorithms

- Computer Science, Mathematics
- IEEE Journal of Selected Topics in Signal Processing
- 2018

An information-optimal Bayesian causal inference algorithm that embeds the potential outcomes in a vector-valued reproducing kernel Hilbert space and uses a multitask Gaussian process prior over that space to infer the individualized causal effects.

Limits of Estimating Heterogeneous Treatment Effects: Guidelines for Practical Algorithm Design

- Computer Science
- ICML
- 2018

This paper characterizes the fundamental limits of estimating heterogeneous treatment effects, establishes conditions under which these limits can be achieved, and builds a practical algorithm for estimating treatment effects using non-stationary Gaussian processes with doubly-robust hyperparameters.

Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes

- Computer Science, Mathematics
- NIPS
- 2017

A novel multi-task learning framework is developed in which factual and counterfactual outcomes are modeled as the outputs of a function in a vector-valued reproducing kernel Hilbert space (vvRKHS), together with a nonparametric Bayesian method for learning the treatment effects using a multi-task Gaussian process (GP) with a linear coregionalization kernel as a prior over the vvRKHS.

A Survey on Causal Inference

- Computer Science, Mathematics
- ACM Trans. Knowl. Discov. Data
- 2021

This survey provides a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks, and presents the plausible applications of these methods, including the applications in advertising, recommendation, medicine, and so on.

Quasi-oracle estimation of heterogeneous treatment effects

- Mathematics, Economics
- 2017

This paper develops a general class of two-step algorithms for heterogeneous treatment effect estimation in observational studies that have a quasi-oracle property, implements variants of this approach based on penalized regression, kernel ridge regression, and boosting, and finds promising performance relative to existing baselines.

Interval Estimation of Individual-Level Causal Effects Under Unobserved Confounding

- Mathematics, Computer Science
- AISTATS
- 2019

A functional interval estimator is developed that predicts bounds on the individual causal effects under realistic violations of unconfoundedness, and it is proved to converge exactly to the tightest bounds possible on CATE when there may be unobserved confounders.

A Causal Dirichlet Mixture Model for Causal Inference from Observational Data

- Computer Science
- ACM Trans. Intell. Syst. Technol.
- 2020

A novel prior called Causal DP is proposed, and a model called CDP is designed to estimate various kinds of causal effects (average, conditional average, average treated, quantile, and so on); the model performs well with missing covariates and does not suffer from overfitting.

Causal Inference with Complex Data Structures and Non-Standard Effects

- Computer Science
- 2020

This thesis argues that incremental effects are much more efficient than conventional deterministic effects in a novel infinite time horizon setting, where the number of timepoints can grow to infinity, and gives a novel adaptation of unsupervised learning methods for analyzing treatment effect heterogeneity.

Validating Causal Inference Models via Influence Functions

- Computer Science
- ICML
- 2019

This paper uses influence functions — the functional derivatives of a loss function — to develop a model validation procedure that estimates the estimation error of causal inference methods.

## References


Estimation and Inference of Heterogeneous Treatment Effects using Random Forests

- Computer Science, Mathematics
- Journal of the American Statistical Association
- 2018

This is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference; the method is found to be substantially more powerful than classical methods based on nearest-neighbor matching.

Double machine learning for treatment and causal parameters

- Mathematics
- 2016

Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically…

Distinguishing Cause from Effect Using Observational Data: Methods and Benchmarks

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2016

Empirical results on real-world data indicate that certain methods are indeed able to distinguish cause from effect using only purely observational data, although more benchmark data would be needed to obtain statistically significant conclusions.

Inference on Treatment Effects after Selection Amongst High-Dimensional Controls

- Computer Science
- 2011

This work develops a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the "post-double-selection" method, which resolves the problem of uniform inference after model selection for a large, interesting class of models.

Approximate Residual Balancing: De-Biased Inference of Average Treatment Effects in High Dimensions.

- Mathematics, Economics
- 2016

There are many settings where researchers are interested in estimating average treatment effects and are willing to rely on the unconfoundedness assumption, which requires that the treatment…

Bounds on direct effects in the presence of confounded intermediate variables.

- Mathematics, Medicine
- Biometrics
- 2008

The symbolic Balke-Pearl linear programming method is applied to derive closed-form formulas for the upper and lower bounds on the ACDE under various assumptions of monotonicity to enable clinical experimenters to assess the direct effect of treatment from observed data with minimum computational effort.

Efficient Inference of Average Treatment Effects in High Dimensions via Approximate Residual Balancing

- Mathematics
- 2016

There are many settings where researchers are interested in estimating average treatment effects and are willing to rely on the unconfoundedness assumption, which requires that the treatment…

A Nonparametric Bayesian Analysis of Heterogenous Treatment Effects in Digital Experimentation

- Computer Science, Mathematics
- 2014

A fast and scalable Bayesian nonparametric analysis of heterogenous treatment effects and their measurement in relation to observable covariates is presented, and it is argued that practitioners should look to ensembles of trees (forests) rather than individual trees in their analysis.

Bayesian Nonparametric Modeling for Causal Inference

- Mathematics
- 2011

Researchers have long struggled to identify causal effects in nonexperimental settings. Many recently proposed strategies assume ignorability of the treatment assignment mechanism and require fitting…

Recursive partitioning for heterogeneous causal effects

- Mathematics, Economics
- Proceedings of the National Academy of Sciences
- 2016

This paper provides a data-driven approach to partition the data into subpopulations that differ in the magnitude of their treatment effects, and proposes an “honest” approach to estimation, whereby one sample is used to construct the partition and another to estimate treatment effects for each subpopulation.