• Corpus ID: 237304128

# Heavy-tailed Streaming Statistical Estimation

@article{Tsai2022HeavytailedSS,
title={Heavy-tailed Streaming Statistical Estimation},
journal={ArXiv},
year={2022},
volume={abs/2108.11483}
}
• Published 25 August 2021
• Computer Science, Mathematics
• ArXiv
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional $O(p)$ space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gradients, which we show is critical when analyzing stochastic optimization problems arising from…
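As a rough illustration of the idea in the abstract, the following minimal sketch applies clipped SGD to streaming mean estimation, using the squared loss whose stochastic gradient at `theta` is `theta - x`. It keeps only one length-`p` iterate in memory. The function names, the clipping threshold, the `1/t` step size, and the synthetic Pareto-type stream are all assumptions made for this example, not the paper's algorithm parameters or experiments.

```python
import math
import random


def clip(vec, threshold):
    """Rescale vec so that its Euclidean norm is at most threshold."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= threshold:
        return list(vec)
    scale = threshold / norm
    return [v * scale for v in vec]


def clipped_sgd_mean(stream, dim, clip_threshold, step_size):
    """Single pass over the stream; stores only one length-dim iterate."""
    theta = [0.0] * dim
    for t, x in enumerate(stream, start=1):
        # Stochastic gradient of 0.5 * ||theta - x||^2 with respect to theta.
        grad = [th - xi for th, xi in zip(theta, x)]
        clipped = clip(grad, clip_threshold)
        eta = step_size(t)
        theta = [th - eta * g for th, g in zip(theta, clipped)]
    return theta


def heavy_tailed_stream(n, mean, rng):
    """Hypothetical test data: mean plus symmetric Pareto-type noise (shape 2.5)."""
    for _ in range(n):
        yield [
            m + rng.choice((-1.0, 1.0)) * (rng.paretovariate(2.5) - 1.0)
            for m in mean
        ]


rng = random.Random(0)
true_mean = [1.0, -2.0, 0.5]
estimate = clipped_sgd_mean(
    heavy_tailed_stream(20000, true_mean, rng),
    dim=3,
    clip_threshold=5.0,          # assumed value, not tuned
    step_size=lambda t: 1.0 / t,  # assumed schedule
)
print(estimate)
```

Without clipping, the `1/t` step size reduces this loop to the running sample mean; the clipping step only limits how far a single extreme sample can move the iterate.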
## 3 Citations

### Streaming Algorithms for High-Dimensional Robust Statistics

• Computer Science
ICML
• 2022
The main result is for the task of high-dimensional robust mean estimation in (a strengthening of) Huber’s contamination model, which gives an efficient single-pass streaming algorithm with near-optimal error guarantees and space complexity nearly-linear in the dimension.

### Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance

• Computer Science, Mathematics
COLT
• 2022
This work quantifies the convergence rate of the Stochastic Mirror Descent algorithm with a particular class of uniformly convex mirror maps, in terms of the number of iterations, dimensionality and related geometric parameters of the optimization problem.

### Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation

• Computer Science
ICLR
• 2022
A variant of NCE called eNCE is introduced, which uses an exponential loss and for which normalized gradient descent provably addresses the landscape issues when the target and noise distributions are in a given exponential family.

## References

### Mean Estimation and Regression Under Heavy-Tailed Distributions: A Survey

• Mathematics, Computer Science
Found. Comput. Math.
• 2019
This work describes sub-Gaussian mean estimators for possibly heavy-tailed data in both the univariate and multivariate settings and focuses on estimators based on median-of-means techniques, but other methods such as the trimmed-mean and Catoni's estimators are also reviewed.
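A minimal sketch of the univariate median-of-means idea mentioned in this summary: split the sample into equal-size blocks, average each block, and return the median of the block means. The function name, the choice to drop leftover samples, and the toy inputs are assumptions made for illustration.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Median of the means of num_blocks equal-size, non-overlapping blocks."""
    samples = list(samples)
    if not 1 <= num_blocks <= len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    block_size = len(samples) // num_blocks
    # Samples beyond num_blocks * block_size are dropped in this sketch.
    block_means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# A single large outlier corrupts only one block mean, so the median ignores it.
data = [1.0, 1.0, 1.0, 1.0, 1.0, 1000.0]
print(median_of_means(data, num_blocks=3))
print(statistics.fmean(data))
```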

### Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping

• Computer Science
NeurIPS
• 2020
The first non-trivial high-probability complexity bounds for SGD with clipping, without a light-tails assumption on the noise, are derived, closing the gap in the theory of stochastic optimization with heavy-tailed noise.

### Simple and optimal high-probability bounds for strongly-convex stochastic gradient descent

• Computer Science, Mathematics
ArXiv
• 2019
A simple, non-uniform averaging strategy of Lacoste-Julien et al. (2011) is considered and it is proved that it achieves the optimal $O(1/T)$ convergence rate with high probability.
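A non-uniform average of SGD iterates can be maintained in a streaming fashion. The sketch below assumes weights proportional to the iteration index $t$, so later iterates count more; the exact weighting used in the cited work may differ (for example by an index shift), and the function name is hypothetical.

```python
def weighted_average_iterates(iterates):
    """Running average of iterates with weight t on the t-th iterate (t = 1, 2, ...)."""
    avg = None
    total_weight = 0.0
    for t, x in enumerate(iterates, start=1):
        weight = float(t)
        total_weight += weight
        if avg is None:
            avg = list(x)
        else:
            # Incremental update: avg moves toward x by the fraction weight / total_weight.
            ratio = weight / total_weight
            avg = [a + ratio * (xi - a) for a, xi in zip(avg, x)]
    return avg


# Two iterates with weights 1 and 2: (1 * 0 + 2 * 3) / 3 = 2.
print(weighted_average_iterates([[0.0], [3.0]]))
```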

### Robust multivariate mean estimation: The optimality of trimmed mean

• Mathematics, Computer Science
The Annals of Statistics
• 2021
A multivariate extension of the trimmed-mean estimator is introduced and its optimal performance under minimal conditions is shown.

### Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection

• Computer Science
NeurIPS
• 2019
QUE-scoring, a new outlier scoring method based on quantum entropy regularization, is evaluated via extensive experiments on synthetic and real data, and it is demonstrated that it often performs better than previously proposed algorithms.

### Robust sub-Gaussian estimation of a mean vector in nearly linear time

• Computer Science, Mathematics
The Annals of Statistics
• 2022
The algorithm is fully data-dependent: its construction uses neither the proportion of outliers nor the rate above. It combines recently developed tools for median-of-means estimators and covering semidefinite programming.

### Tight Analyses for Non-Smooth Stochastic Gradient Descent

• Computer Science, Mathematics
COLT
• 2019
It is proved that after $T$ steps of stochastic gradient descent, the error of the final iterate is $O(\log(T)/T)$ with high probability, and that there exists a function from this class for which the error of the last iterate of deterministic gradient descent is $\Omega(\log(T)/\sqrt{T})$.

### Risk minimization by median-of-means tournaments

• Mathematics, Computer Science
• 2016
A new procedure is introduced, the so-called median-of-means tournament, that achieves the optimal tradeoff between accuracy and confidence under minimal assumptions, and in particular outperforms classical methods based on empirical risk minimization.

### Geometric median and robust estimation in Banach spaces

In many real-world applications, collected data are contaminated by noise with heavy-tailed distribution and might contain outliers of large magnitude. In this situation, it is necessary to apply

### Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization

• Computer Science
ICML
• 2012
This paper investigates the optimality of SGD in a stochastic setting and shows that, for smooth problems, the algorithm attains the optimal $O(1/T)$ rate. For non-smooth problems, however, the convergence rate with averaging might really be $\Omega(\log(T)/T)$, and this is not just an artifact of the analysis.