# Machine Learning for Variance Reduction in Online Experiments

@inproceedings{Guo2021MachineLF, title={Machine Learning for Variance Reduction in Online Experiments}, author={Yongyi Guo and Dominic Coey and Mikael Konutgan and Wenting Li and Ch. P. Schoener and Matt Goldman}, booktitle={NeurIPS}, year={2021} }

We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is…
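The cross-fitting idea in the abstract can be illustrated with a minimal sketch. It is not the MLRATE estimator itself, whose exact form the text above does not give. It assumes a simplified adjustment (a difference in means of outcomes minus out-of-fold predictions), uses a one-variable least-squares line as a stand-in for a machine learning predictor, and runs on simulated data. All function and variable names are hypothetical.

```python
import random
from statistics import mean, variance


def fit_line(xs, ys):
    """Least-squares line; a stand-in for any ML outcome predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cross_fit_predictions(xs, ys, n_folds=2):
    """Predict each outcome with a model trained on the other folds only."""
    n = len(xs)
    preds = [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in range(k, n, n_folds):  # held-out fold k
            preds[i] = model(xs[i])
    return preds


def diff_in_means(values, treated):
    """Return the difference in means and its estimated variance."""
    t = [v for v, d in zip(values, treated) if d]
    c = [v for v, d in zip(values, treated) if not d]
    return mean(t) - mean(c), variance(t) / len(t) + variance(c) / len(c)


# Simulated experiment (assumed data): the true treatment effect is 0.5.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]           # pre-treatment covariate
treated = [rng.random() < 0.5 for _ in range(n)]  # randomized assignment
y = [2.0 * xi + 0.5 * d + rng.gauss(0, 1) for xi, d in zip(x, treated)]

g = cross_fit_predictions(x, y)                   # out-of-fold predictions
residual = [yi - gi for yi, gi in zip(y, g)]

raw_effect, raw_var = diff_in_means(y, treated)
adj_effect, adj_var = diff_in_means(residual, treated)
print(f"unadjusted: {raw_effect:.3f} (variance {raw_var:.5f})")
print(f"adjusted:   {adj_effect:.3f} (variance {adj_var:.5f})")
```

Because each prediction comes from a model that never saw that unit's outcome, the adjustment does not reuse a unit's own noise, which is the overfitting concern cross-fitting is meant to address.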

## 3 Citations

### Variance Reduction for Experiments with One-Sided Triggering using CUPED

- Mathematics
- 2021

In online experimentation, trigger-dilute analysis is an approach to obtain more precise estimates of intent-to-treat (ITT) effects when the intervention is only exposed, or "triggered", for a small…

### More Reviews May Not Help: Evidence from Incentivized First Reviews on Airbnb

- Economics
- 2021

Online reviews are typically written by volunteers and, as a consequence, information about seller quality may be under-provided in digital marketplaces. We study the extent of this under-provision…

### Do Incentives to Review Help the Market? Evidence from a Field Experiment on Airbnb

- 2022

Many online reputation systems operate by asking volunteers to write reviews for free. As a result, a large share of buyers do not review, and those who do review are self-selected. This can cause…

## References

### High-dimensional regression adjustments in randomized experiments

- Computer Science, Economics
- Proceedings of the National Academy of Sciences
- 2016

This work studies the problem of treatment effect estimation in randomized experiments with high-dimensional covariate information and shows that essentially any risk-consistent regression adjustment can be used to obtain efficient estimates of the average treatment effect.

### No-harm calibration for generalized Oaxaca-Blinder estimators

- Mathematics
- 2020

In randomized experiments, linear regression with baseline features can be used to form an estimate of the sample average treatment effect that is asymptotically no less efficient than the…

### Improving Treatment Effect Estimators Through Experiment Splitting

- Mathematics
- WWW
- 2019

Using a dataset of 226 Facebook News Feed A/B tests, it is shown that a lasso estimator based on repeated experiment splitting has a 44% lower mean squared predictive error than the conventional, unshrunk treatment effect estimator, and would lead to substantially improved launch decisions.

### Cross-fitting and fast remainder rates for semiparametric estimation

- Mathematics
- 2017

There are many interesting and widely used estimators of a functional with finite semi-parametric variance bound that depend on nonparametric estimators of nuisance functions. We use cross-fitting to…

### Quasi-oracle estimation of heterogeneous treatment effects

- Computer Science, Mathematics
- 2017

This paper develops a general class of two-step algorithms for heterogeneous treatment effect estimation in observational studies that have a quasi-oracle property, implements variants of this approach based on penalized regression, kernel ridge regression, and boosting, and finds promising performance relative to existing baselines.

### Generalized random forests

- Computer Science, Mathematics
- The Annals of Statistics
- 2019

A flexible, computationally efficient algorithm for growing generalized random forests, an adaptive weighting function derived from a forest designed to express heterogeneity in the specified quantity of interest, and an estimator for their asymptotic variance that enables valid confidence intervals are proposed.

### Improving the sensitivity of online controlled experiments by utilizing pre-experiment data

- Computer Science
- WSDM '13
- 2013

This work proposes an approach (CUPED) that utilizes data from the pre-experiment period to reduce metric variability and hence achieve better sensitivity in experiments, applicable to a wide variety of key business metrics.
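As a rough illustration of how pre-experiment data can reduce metric variability, the sketch below applies a control-variate adjustment: each outcome is shifted by `theta` times the centered pre-experiment value, with `theta = cov(y, x_pre) / var(x_pre)`. This choice of `theta` is the standard control-variates coefficient and is assumed here, since the summary above does not state the formula. The data are simulated and all names are hypothetical.

```python
import random
from statistics import mean, variance


def cuped_adjust(y, x_pre):
    """Return y - theta * (x_pre - mean(x_pre)), theta = cov(y, x_pre) / var(x_pre)."""
    mx, my = mean(x_pre), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x_pre, y)) / (len(y) - 1)
    theta = cov / variance(x_pre)
    return [b - theta * (a - mx) for a, b in zip(x_pre, y)]


def effect(values, treated):
    """Difference in means between treated and control units."""
    t = [v for v, d in zip(values, treated) if d]
    c = [v for v, d in zip(values, treated) if not d]
    return mean(t) - mean(c)


# Simulated experiment (assumed data): the true treatment effect is 0.3.
rng = random.Random(1)
n = 1000
x_pre = [rng.gauss(10, 3) for _ in range(n)]  # metric before the experiment
treated = [i % 2 == 0 for i in range(n)]      # alternating assignment
y = [0.8 * a + 0.3 * d + rng.gauss(0, 1) for a, d in zip(x_pre, treated)]

y_cuped = cuped_adjust(y, x_pre)
print(f"variance before: {variance(y):.3f}, after: {variance(y_cuped):.3f}")
print(f"effect before: {effect(y, treated):.3f}, after: {effect(y_cuped, treated):.3f}")
```

The adjustment leaves the overall mean of the metric unchanged, and because the pre-experiment value cannot be affected by the treatment, the difference in means still targets the same effect with less noise.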

### Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique

- Economics
- 2012

Freedman [Adv. in Appl. Math. 40 (2008) 180-193; Ann. Appl. Stat. 2 (2008) 176-196] critiqued ordinary least squares regression adjustment of estimated treatment effects in randomized experiments,…

### Semiparametric theory and empirical processes in causal inference

- Mathematics, Economics
- 2016

In this paper we review important aspects of semiparametric theory and empirical processes that arise in causal inference problems. We begin with a brief introduction to the general problem of causal…

### Covariate adjustment for two‐sample treatment comparisons in randomized clinical trials: A principled yet flexible approach

- Economics
- Statistics in Medicine
- 2008

Applying the theory of semiparametrics leads naturally to a characterization of all treatment effect estimators and to principled, practically feasible methods for covariate adjustment that yield the desired gains in efficiency and that allow covariate relationships to be identified and exploited while circumventing the usual concerns.