The reusable holdout: Preserving validity in adaptive data analysis

@article{Dwork2015TheRH,
  title={The reusable holdout: Preserving validity in adaptive data analysis},
  author={Cynthia Dwork and Vitaly Feldman and Moritz Hardt and Toniann Pitassi and Omer Reingold and Aaron Roth},
  journal={Science},
  year={2015},
  volume={349},
  pages={636--638}
}
Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. Existing approaches to ensuring the validity of inferences drawn from data assume a fixed procedure to be performed, selected before the data are examined. In common practice, however, data analysis is an intrinsically adaptive process, with new analyses generated on the basis of data exploration, as well as the results of previous analyses on the same data. We demonstrate a new…

Citations

Publications citing this paper.
SHOWING 1-10 OF 97 CITATIONS, ESTIMATED 25% COVERAGE

Final Program Report Theoretical Foundations of Big Data Analysis (Fall 2013)

Michael I. Jordan, Organizing Chair
  • 2018
CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Train on Validation: Squeezing the Data Lemon

CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

Turning Lemons into Peaches using Secure Computation

CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Bridging Supervised Learning and Test-Based Co-optimization

  • Journal of Machine Learning Research
  • 2017
CITES BACKGROUND
HIGHLY INFLUENCED

Leveraging Privacy in Data Analysis

CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Controlling Bias in Adaptive Data Analysis Using Information Theory

CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

Mitigating Bias in Adaptive Data Gathering via Differential Privacy

CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Generalization for Adaptively-chosen Estimators via Stable Median

CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

CITATION STATISTICS

  • 12 Highly Influenced Citations

  • Averaged 26 Citations per year over the last 3 years

  • 9% Increase in citations per year in 2018 over 2017

References

Publications referenced by this paper.
SHOWING 1-3 OF 3 REFERENCES

Trust in science would be improved by study pre-registration

C. Chambers, M. Munafo
  • Guardian US
  • 2013
