Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- Stefan Wager, S. Athey
- Mathematics, Computer Science; Journal of the American Statistical Association
- 14 October 2015
This paper presents the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference; the proposed method is found to be substantially more powerful than classical methods based on nearest-neighbor matching.
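A key ingredient behind the validity results is "honest" estimation: the sample used to place splits is kept disjoint from the sample used to estimate effects within leaves. A minimal numpy sketch of one honest, depth-one split on simulated data (the split criterion and data here are illustrative, not the paper's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(-1, 1, n)            # single covariate
W = rng.integers(0, 2, n)            # randomized binary treatment
tau = np.where(X > 0, 1.0, 0.0)      # true effect: 1 if X > 0, else 0
Y = tau * W + rng.normal(0, 0.5, n)

# Honesty: half the sample chooses the split, the other half estimates effects.
split_idx, est_idx = np.arange(n) < n // 2, np.arange(n) >= n // 2

def leaf_effect(mask):
    """Difference in treated vs. control means within a leaf."""
    return Y[mask & (W == 1)].mean() - Y[mask & (W == 0)].mean()

# Choose the split point on the splitting half by maximizing the squared
# difference between the two leaves' effects (a crude heterogeneity criterion).
candidates = np.quantile(X[split_idx], np.linspace(0.1, 0.9, 17))
def hetero(c):
    return (leaf_effect(split_idx & (X <= c)) -
            leaf_effect(split_idx & (X > c))) ** 2
best = max(candidates, key=hetero)

# Estimate leaf effects on the held-out half only.
tau_left = leaf_effect(est_idx & (X <= best))
tau_right = leaf_effect(est_idx & (X > best))
print(round(best, 2), round(tau_left, 2), round(tau_right, 2))
```

Because the estimation sample never influenced the split, the leaf estimates behave like ordinary difference-in-means estimates, which is what makes the paper's confidence intervals possible.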
Generalized random forests
- S. Athey, J. Tibshirani, Stefan Wager
- Computer Science, Mathematics; Annals of Statistics
- 4 October 2016
Proposes a flexible, computationally efficient algorithm for growing generalized random forests; an adaptive weighting function, derived from the forest, that expresses heterogeneity in the specified quantity of interest; and an estimator of the asymptotic variance that enables valid confidence intervals.
Quasi-oracle estimation of heterogeneous treatment effects
- Xinkun Nie, Stefan Wager
- Computer Science, Mathematics
- 13 December 2017
This paper develops a general class of two-step algorithms for heterogeneous treatment effect estimation in observational studies that have a quasi-oracle property, implements variants of the approach based on penalized regression, kernel ridge regression, and boosting, and finds promising performance relative to existing baselines.
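The two steps are: estimate the outcome surface m(x) = E[Y | X = x] and the propensity e(x) = E[W | X = x], then regress outcome residuals on treatment residuals to recover the effect function. A hedged numpy sketch with simulated data, a linear effect model, and a known propensity (the paper allows arbitrary machine learning methods at each stage; plain least squares keeps this self-contained):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 4000, 3
X = rng.normal(size=(n, p))
e = 0.5                                   # known propensity (randomized trial)
W = rng.binomial(1, e, n)
tau_true = 1.0 + 2.0 * X[:, 0]            # heterogeneous effect, linear in X[:,0]
Y = X @ np.array([1.0, -1.0, 0.5]) + tau_true * W + rng.normal(0, 1, n)

# Step 1: cross-fit m(x) = E[Y | X = x] with a simple linear model.
m_hat = np.empty(n)
half = n // 2
for tr, te in [(slice(0, half), slice(half, n)),
               (slice(half, n), slice(0, half))]:
    beta = np.linalg.lstsq(np.column_stack([np.ones(half), X[tr]]),
                           Y[tr], rcond=None)[0]
    m_hat[te] = np.column_stack([np.ones(half), X[te]]) @ beta

# Step 2: minimize sum(((Y - m_hat) - tau(X) * (W - e))^2) over linear tau.
R = W - e
Z = np.column_stack([np.ones(n), X]) * R[:, None]
theta = np.linalg.lstsq(Z, Y - m_hat, rcond=None)[0]
print(theta.round(2))    # intercept ≈ 1, coefficient on X[:,0] ≈ 2
```

The residual-on-residual regression is what gives the quasi-oracle property: errors in the first-stage estimates of m and e enter the second stage only through products of errors.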
Approximate residual balancing: debiased inference of average treatment effects in high dimensions
- S. Athey, G. Imbens, Stefan Wager
- Economics, Computer Science
- 24 April 2016
A method for debiasing penalized regression adjustments that allows sparse regression methods such as the lasso to be used for √n-consistent inference of average treatment effects in high-dimensional linear models.
Dropout Training as Adaptive Regularization
- Stefan Wager, Sida I. Wang, Percy Liang
- Computer Science; NIPS
- 4 July 2013
By casting dropout as regularization, this work develops a natural semi-supervised algorithm that uses unlabeled data to create a better adaptive regularizer and consistently boosts the performance of dropout training, improving on state-of-the-art results on the IMDB reviews dataset.
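The paper's central observation is that input dropout in a generalized linear model behaves like an adaptive, per-feature L2 penalty. A toy numpy sketch for logistic regression on simulated data (the SGD setup and hyperparameters are illustrative, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, keep = 2000, 10, 0.5
X = rng.normal(size=(n, p))
w_true = np.zeros(p)
w_true[:3] = [2.0, -2.0, 1.0]
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-(X @ w_true)))).astype(float)

def train(dropout, epochs=15, lr=0.1):
    """SGD for logistic regression, optionally with input dropout."""
    w = np.zeros(p)
    for _ in range(epochs):
        for i in rng.permutation(n):
            x = X[i]
            if dropout:
                # Drop inputs independently; rescale to keep E[x] unchanged.
                x = x * (rng.uniform(size=p) < keep) / keep
            pred = 1 / (1 + np.exp(-(x @ w)))
            w = w + lr * (y[i] - pred) * x
    return w

w_plain = train(dropout=False)
w_drop = train(dropout=True)
# Dropout acts like an adaptive ridge penalty, shrinking the weights.
print(np.abs(w_drop).sum(), np.abs(w_plain).sum())
```

The shrinkage is "adaptive" because the implied penalty on each weight scales with how much noise dropout injects through that feature, which is the property the paper exploits to build a semi-supervised regularizer.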
Efficient Policy Learning
- S. Athey, Stefan Wager
- Computer Science, Economics; ArXiv
- 9 February 2017
This paper derives lower bounds for the minimax regret of policy learning under constraints, and proposes a method that attains this bound asymptotically up to a constant factor whenever the class of policies under consideration has a bounded Vapnik-Chervonenkis dimension.
Policy Learning With Observational Data
- S. Athey, Stefan Wager
- Economics, Mathematics
- 9 February 2017
Given a doubly robust estimator of the causal effect of assigning everyone to treatment, an algorithm for choosing whom to treat is developed, and strong guarantees for the asymptotic utilitarian regret of the resulting policy are established.
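The pipeline starts from a doubly robust (AIPW) score for each individual's treatment effect and then picks, within the policy class, the rule that maximizes the sum of scores over the treated region. A numpy sketch with simulated data, known propensities, and crude binned outcome models (all details here are illustrative stand-ins for the paper's nuisance estimators):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
X = rng.uniform(-1, 1, n)
e = 0.5                               # known randomization probability
W = rng.binomial(1, e, n)
tau = X                               # true effect: positive iff X > 0
Y = X**2 + tau * W + rng.normal(0, 0.5, n)

# Plug-in outcome models mu_w(x): constant fits within bins of X.
bins = np.digitize(X, np.linspace(-1, 1, 11))
mu1, mu0 = np.zeros(n), np.zeros(n)
for b in np.unique(bins):
    m = bins == b
    mu1[m] = Y[m & (W == 1)].mean()
    mu0[m] = Y[m & (W == 0)].mean()

# AIPW / doubly robust score for each individual's treatment effect.
Gamma = (mu1 - mu0
         + W * (Y - mu1) / e
         - (1 - W) * (Y - mu0) / (1 - e))

# Policy class: threshold rules "treat iff X > c"; pick the threshold
# maximizing the estimated value of treating that region.
cs = np.linspace(-1, 1, 41)
value = [Gamma[X > c].sum() for c in cs]
c_best = cs[int(np.argmax(value))]
print(round(c_best, 2))   # chosen threshold, typically near 0 here
```

Using the doubly robust score rather than a raw difference in means is what drives the paper's regret guarantees: the policy objective stays approximately unbiased even when one of the two nuisance models is misspecified.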
High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification
- Edgar Dobriban, Stefan Wager
- Computer Science, Mathematics
- 10 July 2015
Provides a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a dense random-effects model in a high-dimensional asymptotic regime, finding that predictive accuracy has a nuanced dependence on the eigenvalue distribution of the covariance matrix.
Estimating Treatment Effects with Causal Forests: An Application
- S. Athey, Stefan Wager
- Psychology; Observational Studies
- 20 February 2019
We apply causal forests to a dataset derived from the National Study of Learning Mindsets, and consider resulting practical and conceptual challenges. In particular, we discuss how causal forests use…
Confidence intervals for random forests: the jackknife and the infinitesimal jackknife
- Stefan Wager, T. Hastie, B. Efron
- Mathematics; Journal of Machine Learning Research
- 18 November 2013
Studies the variability of predictions made by bagged learners and random forests, shows how to estimate standard errors for these methods, and proposes improved versions of the jackknife and infinitesimal jackknife (IJ) estimators that require only B = Θ(n) replicates to converge.
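The IJ estimate for bagging is V_IJ = Σ_i Cov_b(N_bi, t_b)², where N_bi counts how often observation i appears in bootstrap sample b and t_b is that replicate's prediction. A numpy sketch for the bagged sample mean, whose sampling variance is known in closed form so the estimate can be checked (the base learner here is deliberately trivial):

```python
import numpy as np

rng = np.random.default_rng(4)
n, B = 200, 5000
x = rng.normal(size=n)

# Bootstrap replicates of a simple base learner (the sample mean).
N = rng.multinomial(n, np.ones(n) / n, size=B)   # bag counts, shape (B, n)
t = (N * x).sum(axis=1) / n                      # prediction per replicate

# Infinitesimal jackknife: V_IJ = sum_i Cov_b(N_bi, t_b)^2
cov = ((N - N.mean(axis=0)) * (t - t.mean())[:, None]).mean(axis=0)
V_IJ = (cov ** 2).sum()

# For the bagged mean, the target is Var(sample mean) = s^2 / n.
print(V_IJ, x.var(ddof=1) / n)
```

With a finite number of replicates B, the raw V_IJ carries an upward Monte Carlo bias of order n/B, which is exactly what the paper's bias-corrected versions remove so that B = Θ(n) replicates suffice.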
...