Corpus ID: 246485879

Assessing External Validity Over Worst-case Subpopulations

@inproceedings{Jeong2020AssessingEV,
  title={Assessing External Validity Over Worst-case Subpopulations},
  author={Sookyo Jeong and Hongseok Namkoong},
  year={2020}
}
Study populations are typically sampled from limited points in space and time, and marginalized groups are underrepresented. To assess the external validity of randomized and observational studies, we propose and evaluate the worst-case treatment effect (WTE) across all subpopulations of a given size, which guarantees positive findings remain valid over subpopulations. We develop a semiparametrically efficient estimator for the WTE that analyzes the external validity of the augmented inverse… 
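As a rough illustration of the quantity named in the abstract, the sketch below computes a naive plug-in worst-case average effect over subpopulations containing at least a fraction `alpha` of the sample. It is not the semiparametrically efficient estimator the authors develop. It assumes that per-unit treatment effect estimates are already available, that larger effects are better (so "worst case" means the lowest subgroup mean), and that subpopulations are plain subsets of the sample. The function name `worst_case_subgroup_mean` is hypothetical.

```python
import math


def worst_case_subgroup_mean(effects, alpha):
    """Lowest mean effect over any subset holding at least an alpha fraction of units.

    Assumption: `effects` are per-unit treatment effect estimates and larger is
    better. Under that convention the minimizing subset is the k smallest values,
    where k is the smallest subset size that reaches the alpha fraction.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(effects)
    if n == 0:
        raise ValueError("effects must be non-empty")
    # Small tolerance guards against floating-point overshoot in alpha * n.
    k = max(1, math.ceil(alpha * n - 1e-9))
    lowest = sorted(effects)[:k]
    return sum(lowest) / k


# Toy effects: positive on average, but negative for part of the sample.
toy_effects = [-0.5, 0.5, 1.0, 3.0]
average_effect = worst_case_subgroup_mean(toy_effects, 1.0)
worst_half = worst_case_subgroup_mean(toy_effects, 0.5)
print(average_effect, worst_half)
```

With `alpha = 1.0` the function returns the ordinary sample average. Smaller values of `alpha` show how far the conclusion could move if only the least-benefiting fraction of the population were considered.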
Causal inference methods for combining randomized trials and observational studies: a review
TLDR
This paper first discusses identification and estimation methods that improve the generalizability of randomized controlled trials (RCTs) using the representativeness of observational data, and then methods that combine RCTs and observational data to improve (conditional) average treatment effect estimation.
