Quantifying Heterogeneous Causal Treatment Effects in World Bank Development Finance Projects

Jianing Zhao, Daniel M. Runfola, Peter Kemper
The World Bank provides billions of dollars in development finance to countries across the world every year. Because many of these projects relate to the environment, we want to understand their impact on forest cover. However, the global extent of these projects results in substantial heterogeneity in impacts due to geographic, cultural, and other factors. Recent research by Athey and Imbens has illustrated the potential for hybrid machine learning and causal inferential techniques…
Exploring the Socioeconomic Co-benefits of Global Environment Facility Projects in Uganda Using a Quasi-Experimental Geospatial Interpolation (QGI) Approach
Since 1992, the Global Environment Facility (GEF) has mobilized over $131 billion in funds to enable developing and transitioning countries to meet the objectives of international environmental
A Primer on Geospatial Impact Evaluation Methods, Tools, and Applications
The growing availability of georeferenced data on development investments and outcomes has opened up new opportunities to understand what works, what doesn’t, and why at a substantially lower time
Tackling Climate Change with Machine Learning
From smart grids to disaster management, this paper identifies high-impact problems where machine learning, in collaboration with other fields, can fill existing gaps and join the global effort against climate change.
Predicting road quality using high resolution satellite imagery: A transfer learning approach
This piece uses satellite imagery to estimate road quality and concomitant information about travel speed. It adopts a transfer learning approach in which a convolutional neural network architecture is first trained on data collected in the United States and then “fine-tuned” on an independent, smaller dataset collected from Nigeria.
Heterogeneous Treatment and Spillover Effects Under Clustered Network Interference
A machine learning method is developed that uses tree-based algorithms and a Horvitz-Thompson estimator to assess the heterogeneity of treatment and spillover effects with respect to individual, neighborhood, and network characteristics under clustered network interference.


Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
  • Stefan Wager, S. Athey
  • Mathematics, Computer Science
  • Journal of the American Statistical Association
  • 2018
This is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference and is found to be substantially more powerful than classical methods based on nearest-neighbor matching.
Heterogeneous Treatment Effects in Digital Experimentation
A fast and scalable Bayesian nonparametric analysis of heterogeneity and its measurement in relation to observable covariates leads to a novel estimator of heterogeneity that is based around the distribution of covariates pooled across treatment groups.
Recursive partitioning for heterogeneous causal effects
This paper provides a data-driven approach to partition the data into subpopulations that differ in the magnitude of their treatment effects, and proposes an “honest” approach to estimation, whereby one sample is used to construct the partition and another to estimate treatment effects for each subpopulation.
Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score
It is shown that weighting with the inverse of a nonparametric estimate of the propensity score, rather than the true propensity score, leads to efficient estimates of the various average treatment effects, whether the pre-treatment variables have discrete or continuous distributions.
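As a rough illustration of the weighting idea summarized above, the sketch below computes an inverse-propensity-weighted average treatment effect in plain Python. The function name `ipw_ate`, the toy data, and the use of cell-wise treated fractions as the nonparametric propensity estimate (which only works for a discrete covariate) are assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def ipw_ate(x, w, y):
    """Inverse-propensity-weighted ATE for a discrete covariate x.

    Assumption for this sketch: the propensity score is estimated as the
    fraction of treated units within each covariate cell.
    """
    n_cell = defaultdict(int)
    n_treated = defaultdict(int)
    for xi, wi in zip(x, w):
        n_cell[xi] += 1
        n_treated[xi] += wi
    e_hat = {cell: n_treated[cell] / n_cell[cell] for cell in n_cell}

    total = 0.0
    for xi, wi, yi in zip(x, w, y):
        e = e_hat[xi]
        if not 0.0 < e < 1.0:
            # Every cell needs both treated and control units (overlap).
            raise ValueError("no overlap in covariate cell %r" % (xi,))
        # Treated units are weighted by 1/e, controls by 1/(1 - e).
        total += wi * yi / e - (1 - wi) * yi / (1 - e)
    return total / len(y)


# Hypothetical toy data: treatment is more likely in cell 1,
# and the treatment effect is 2 in cell 0 and 4 in cell 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
w = [1, 0, 0, 0, 1, 1, 1, 0]
y = [3, 1, 1, 1, 6, 6, 6, 2]

estimate = ipw_ate(x, w, y)

# Unweighted difference in means, for comparison.
treated = [yi for wi, yi in zip(w, y) if wi == 1]
control = [yi for wi, yi in zip(w, y) if wi == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)

print(estimate, naive)
```

In this toy data the weighted estimate comes out near 3, the average of the two cell-level effects, while the unweighted difference in means is 4 because treated units are concentrated in the cell with the larger effect.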
Subgroup Analysis via Recursive Partitioning
Subgroup analysis is an integral part of comparative analysis where assessing the treatment effect on a response is of central interest. Its goal is to determine the heterogeneity of the treatment
Quantile Regression Forests
It is shown here that random forests provide information about the full conditional distribution of the response variable, not only about the conditional mean, and that the resulting method is competitive in terms of predictive power.
GUIDO IMBENS, DONALD RUBIN, Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. New York: Cambridge University Press.
Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction is intended for a broad audience and succeeds in presenting material in a manner that is accessible to readers with a reasonable familiarity with mathematics and statistics.
Random Forests
Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Confidence intervals for random forests: the jackknife and the infinitesimal jackknife
The variability of predictions made by bagged learners and random forests is studied, it is shown how to estimate standard errors for these methods, and improved versions of the jackknife and IJ estimators are proposed that only require B = Θ(n) replicates to converge.
Regression Shrinkage and Selection via the Lasso
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
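The constrained problem described in the lasso summary can be written out explicitly. The symbols here are chosen for illustration: N observations, responses y_i, predictors x_ij, an intercept α, coefficients β_j, and a bound t ≥ 0.

```latex
\[
(\hat{\alpha}, \hat{\beta})
  = \arg\min_{\alpha,\,\beta}
    \sum_{i=1}^{N} \Bigl( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^{2}
\quad \text{subject to} \quad
\sum_{j=1}^{p} \lvert \beta_j \rvert \le t .
\]
```

Because the constraint is on absolute values, a sufficiently small t sets some coefficients exactly to zero, which is what gives the method its selection property.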