# Generalization for Adaptively-chosen Estimators via Stable Median

```bibtex
@inproceedings{Feldman2017GeneralizationFA,
  title     = {Generalization for Adaptively-chosen Estimators via Stable Median},
  author    = {Vitaly Feldman and Thomas Steinke},
  booktitle = {Conference on Learning Theory (COLT)},
  year      = {2017}
}
```

Datasets are often reused to perform multiple statistical analyses in an adaptive way, in which each analysis may depend on the outcomes of previous analyses on the same dataset. Standard statistical guarantees do not account for these dependencies, and little is known about how to provably avoid overfitting and false discovery in the adaptive setting. We consider a natural formalization of this problem in which the goal is to design an algorithm that, given a limited number of i.i.d. samples…

## 26 Citations

### Calibrating Noise to Variance in Adaptive Data Analysis

- Computer Science, COLT
- 2018

Demonstrates that a simple and natural algorithm, which adds noise scaled to the standard deviation of the query, satisfies the required stability notion, yielding an algorithm that answers statistical queries about the dataset with substantially improved accuracy guarantees for low-variance queries.
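
The idea of calibrating noise to a query's empirical spread can be sketched as follows. This is an illustrative simplification, not the paper's actual algorithm; the function name and the `scale` constant are hypothetical, and no formal privacy accounting is done here.

```python
import numpy as np

def answer_query_variance_scaled(data, query, scale=1.0, rng=None):
    """Answer a statistical query with Gaussian noise proportional to the
    query's empirical standard deviation (illustrative sketch only)."""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray([query(x) for x in data], dtype=float)
    mean = values.mean()
    std = values.std()
    # Noise is calibrated to the empirical spread of the query values,
    # so low-variance queries receive proportionally less noise.
    noise = rng.normal(0.0, scale * std / np.sqrt(len(values)))
    return mean + noise
```

Note that a constant query (zero variance) is answered exactly, which is precisely the regime where variance-scaled noise beats worst-case sensitivity-scaled noise.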

### Generalization in the Face of Adaptivity: A Bayesian Perspective

- Computer Science, arXiv
- 2021

This paper shows explicitly that the harms of adaptivity come from the covariance between the behavior of future queries and a Bayes-factor-based measure of how much information about the data sample was encoded in the responses to past queries, and uses this intuition to introduce a new stability notion.

### Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis

- Computer Science, AISTATS
- 2020

A framework for providing valid, instance-specific confidence intervals for point estimates generated by heuristics, giving guarantees that are orders of magnitude better than the best worst-case bounds.

### The Limits of Post-Selection Generalization

- Computer Science, Mathematics, NeurIPS
- 2018

Shows a tight lower bound on the error of any algorithm that satisfies post hoc generalization and answers adaptively chosen statistical queries, establishing a strong barrier to progress in post-selection data analysis.

### Privacy-preserving Prediction

- Computer Science, COLT
- 2018

A simple baseline approach, based on training several models on disjoint subsets of the data and predicting via standard private aggregation techniques, has nearly optimal sample complexity for PAC learning of any class of Boolean functions, though the aggregation step introduces substantial overhead.

### Mitigating Bias in Adaptive Data Gathering via Differential Privacy

- Computer Science, ICML
- 2018

This paper shows that there exist differentially private bandit algorithms with near-optimal regret bounds, applying existing theorems in the simple stochastic case and giving a new analysis for linear contextual bandits.

### On the Robustness of CountSketch to Adaptive Inputs

- Computer Science, Mathematics, ICML
- 2022

A robust estimator is proposed (for a slightly modified sketch) that allows for a quadratic number of queries in the sketch size, an improvement factor of √k (for k heavy hitters) over prior "black-box" approaches.

### Learning with User-Level Privacy

- Computer Science, NeurIPS
- 2021

User-level DP protects a user's entire contribution, providing more stringent but more realistic protection against information leaks. It is shown that for high-dimensional mean estimation, empirical risk minimization with smooth losses, stochastic convex optimization, and learning a hypothesis class with finite metric entropy, the privacy cost decreases as O(1/√m) as users provide more samples.

### The structure of optimal private tests for simple hypotheses

- Computer Science, Mathematics, STOC
- 2019

Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about privately testing simple…

### The Sparse Vector Technique, Revisited

- Computer Science, COLT
- 2021

An algorithm for the shifting-heavy-hitters problem is presented whose error guarantees improve over existing techniques by scaling with the number of times the heavy hitters change, rather than with the total number of times in which a heavy hitter exists.

## References

Showing 1–10 of 31 references.

### Algorithmic stability for adaptive data analysis

- Computer Science, Mathematics, STOC
- 2016

Proves the first upper bounds on the number of samples required to answer more general families of queries, including arbitrary low-sensitivity queries and an important class of optimization queries (alternatively, risk minimization queries).

### Generalization in Adaptive Data Analysis and Holdout Reuse

- Computer Science, NIPS
- 2015

A simple and practical method for reusing a holdout set to validate the accuracy of hypotheses produced by a learning algorithm operating on a training set. It is also shown that a simple approach based on description length can give guarantees of statistical validity in adaptive settings.
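
The reusable-holdout idea can be sketched in the spirit of the Thresholdout-style mechanism: return the training estimate when it agrees with the holdout up to a noisy threshold, and otherwise return a noised holdout estimate. This is a minimal sketch with simplified constants, not the paper's exact algorithm; the function name and default parameters are hypothetical.

```python
import numpy as np

def thresholdout(train, holdout, query, threshold=0.04, sigma=0.01, rng=None):
    """Sketch of a Thresholdout-style reusable holdout (constants simplified).

    Returns the training-set estimate when it agrees with the holdout
    estimate up to a noisy threshold; otherwise returns a noisy holdout
    estimate, spending some of the holdout's validation budget."""
    rng = np.random.default_rng() if rng is None else rng
    t_mean = float(np.mean([query(x) for x in train]))
    h_mean = float(np.mean([query(x) for x in holdout]))
    if abs(t_mean - h_mean) > threshold + rng.normal(0.0, sigma):
        return h_mean + rng.normal(0.0, sigma)  # disagreement: answer from holdout, noised
    return t_mean  # train and holdout agree: the training estimate is safe to reuse
```

The design point is that the holdout leaks information only when the analyst has overfit the training set, which is what allows many adaptive validations from one holdout.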

### Privacy-preserving statistical estimation with optimal convergence rates

- Mathematics, Computer Science, STOC '11
- 2011

It is shown that for a large class of statistical estimators T and input distributions P, there is a differentially private estimator A_T with the same asymptotic distribution as T, which implies that A_T(X) is essentially as good as the original statistic T(X) for statistical inference, for sufficiently large samples.

### Preventing False Discovery in Interactive Data Analysis Is Hard

- Computer Science, 2014 IEEE 55th Annual Symposium on Foundations of Computer Science
- 2014

We show that, under a standard hardness assumption, there is no computationally efficient algorithm that given n samples from an unknown distribution can give valid answers to n^{3+o(1)} adaptively…

### Efficient noise-tolerant learning from statistical queries

- Computer Science, STOC
- 1993

This paper formalizes a new but related model of learning from statistical queries, and demonstrates the generality of the statistical query model, showing that practically every class learnable in Valiant's model and its variants can also be learned in the new model (and thus can be learned in the presence of noise).
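
In the statistical query model, the learner never sees individual examples; it asks for expectations of predicates and receives answers accurate to within a tolerance τ. A minimal simulated oracle, with the adversarial perturbation replaced by a uniform one for illustration (the function name is hypothetical), might look like this:

```python
import random

def sq_oracle(examples, chi, tau, rng=None):
    """Simulated statistical-query oracle (sketch).

    In Kearns's model the oracle may return any value within tau of the
    true expectation E[chi(x, label)]; here a uniform perturbation stands
    in for the adversarial one."""
    rng = rng or random.Random()
    avg = sum(chi(x, y) for x, y in examples) / len(examples)
    return avg + rng.uniform(-tau, tau)
```

Because the learner only sees tolerance-τ expectations, independent label noise averages out, which is the intuition behind the noise-tolerance result.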

### Typicality-Based Stability and Privacy

- Computer Science, arXiv
- 2016

It is shown that if a typically stable interaction with a dataset yields a query from that class, then this query when evaluated on the same dataset will have small generalization error with high probability (i.e., it will not overfit to the dataset).

### Interactive fingerprinting codes and the hardness of preventing false discovery

- Computer Science, 2016 Information Theory and Applications Workshop (ITA)
- 2016

It is shown that, under a standard hardness assumption, there is no computationally efficient algorithm that, given n samples from an unknown distribution, can give valid answers to O(n^2) adaptively chosen statistical queries.

### Max-Information, Differential Privacy, and Post-selection Hypothesis Testing

- Computer Science, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)
- 2016

A principled study of how the generalization properties of approximate differential privacy can be used to perform adaptive hypothesis testing with statistically valid p-value corrections, based on the observation that the guarantees of algorithms with bounded approximate max-information suffice to correct the p-values of adaptively chosen hypotheses.

### A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis

- Computer Science, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
- 2010

A new differentially private multiplicative weights mechanism for answering a large number of interactive counting (or linear) queries that arrive online and may be adaptively chosen. It is shown that when the input database is drawn from a smooth distribution (one that does not place too much weight on any single data item), accuracy remains as above and the running time becomes polylogarithmic in the size of the data universe.
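
The core loop of a private multiplicative weights mechanism can be sketched as follows: maintain a synthetic distribution over the data universe, answer from it when it already agrees with a noisy true answer, and otherwise apply a multiplicative update. This is a heavily simplified illustration (fixed threshold, ad hoc Laplace scale, no privacy-budget accounting), not the mechanism from the paper; all names and constants are hypothetical.

```python
import numpy as np

def private_mw(histogram, queries, eps_per_round=0.5, threshold=0.1, rng=None):
    """Sketch of a multiplicative-weights mechanism for linear queries.

    histogram: true database as a normalized histogram over the data universe.
    queries: vectors q in [0,1]^universe; the true answer to q is <q, histogram>.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_universe = len(histogram)
    synth = np.full(n_universe, 1.0 / n_universe)  # uniform synthetic distribution
    answers = []
    for q in queries:
        true_ans = float(np.dot(q, histogram))
        noisy = true_ans + rng.laplace(0.0, 1.0 / eps_per_round)  # ad hoc noise scale
        est = float(np.dot(q, synth))
        if abs(noisy - est) <= threshold:
            answers.append(est)  # synthetic data already answers well; no update
        else:
            # Multiplicative-weights update pushes synth toward the noisy answer.
            sign = 1.0 if noisy > est else -1.0
            synth = synth * np.exp(0.5 * sign * q)
            synth /= synth.sum()
            answers.append(float(np.dot(q, synth)))
    return answers
```

The key efficiency property sketched here is that privacy cost is paid only on rounds where the synthetic database must be updated, and the number of such rounds is bounded by the multiplicative-weights potential argument.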

### Controlling Bias in Adaptive Data Analysis Using Information Theory

- Computer Science, AISTATS
- 2016

A general information-theoretic framework is proposed to quantify and provably bound the bias and other statistics of an arbitrary adaptive analysis process, and it is proved that the mutual-information-based bound is tight in natural models.