# Robustly Learning any Clusterable Mixture of Gaussians

@article{Diakonikolas2020RobustlyLA, title={Robustly Learning any Clusterable Mixture of Gaussians}, author={Ilias Diakonikolas and Samuel B. Hopkins and Daniel M. Kane and Sushrut Karmalkar}, journal={ArXiv}, year={2020}, volume={abs/2005.06417} }

We study the efficient learnability of high-dimensional Gaussian mixtures in the outlier-robust setting, where a small constant fraction of the data is adversarially corrupted. We resolve the polynomial learnability of this problem when the components are pairwise separated in total variation distance. Specifically, we provide an algorithm that, for any constant number of components $k$, runs in polynomial time and learns the components of an $\epsilon$-corrupted $k$-mixture within information…
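To make the corruption model in the abstract concrete, here is a minimal sketch of how an $\epsilon$-corrupted sample from a Gaussian mixture might be simulated. It is an illustration only and not the paper's algorithm: it works in one dimension rather than the high-dimensional setting, the function names `sample_mixture` and `corrupt` are hypothetical, and placing every outlier at a single far-away value stands in for an adversary that could choose replacements adaptively.

```python
import random


def sample_mixture(n, weights, means, stddevs, rng):
    """Draw n points from a 1-D Gaussian mixture (illustrative setup)."""
    points = []
    for _ in range(n):
        # Pick a component according to the mixing weights, then sample from it.
        j = rng.choices(range(len(weights)), weights=weights)[0]
        points.append(rng.gauss(means[j], stddevs[j]))
    return points


def corrupt(points, eps, rng, outlier_value=1e6):
    """Replace an eps-fraction of the points with an arbitrary value.

    Assumption: a fixed far-away outlier is the simplest stand-in for an
    adversary, who in general may inspect the sample before replacing points.
    """
    n_bad = int(eps * len(points))
    corrupted = list(points)
    for i in rng.sample(range(len(points)), n_bad):
        corrupted[i] = outlier_value
    return corrupted


rng = random.Random(0)
clean = sample_mixture(1000, [0.5, 0.5], [-5.0, 5.0], [1.0, 1.0], rng)
noisy = corrupt(clean, 0.05, rng)

n_changed = sum(1 for a, b in zip(clean, noisy) if a != b)
naive_mean = sum(noisy) / len(noisy)
```

Even with only 5% of the points replaced, the naive sample mean of `noisy` is dragged far from both component means, which is why outlier-robust estimators are needed in this setting.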

## 29 Citations

### Outlier-Robust Clustering of Gaussians and Other Non-Spherical Mixtures

- Computer Science, Mathematics
- 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS)
- 2020

The techniques expand the sum-of-squares toolkit to show robust certifiability of TV-separated Gaussian clusters in data, and give the first outlier-robust efficient algorithm for clustering a mixture of statistically separated Gaussians.

### Outlier-Robust Clustering of Non-Spherical Mixtures

- Computer Science, Mathematics
- ArXiv
- 2020

The techniques expand the sum-of-squares toolkit to show robust certifiability of TV-separated Gaussian clusters in data and extend to clustering mixtures of arbitrary affine transforms of the uniform distribution on the d-dimensional unit sphere.

### Robust Sparse Mean Estimation via Sum of Squares

- Computer Science
- COLT
- 2022

This work develops the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance for identity-covariance subgaussian distributions on $\mathbb{R}^d$ with "certifiably bounded" t-th moments and sufficiently light tails.

### Settling the robust learnability of mixtures of Gaussians

- Computer Science
- STOC
- 2021

This work gives the first provably robust algorithm for learning mixtures of any constant number of Gaussians, a new method for proving dimension-independent polynomial identifiability through applying a carefully chosen sequence of differential operations to certain generating functions.

### Robustly learning mixtures of k arbitrary Gaussians

- Computer Science, Mathematics
- STOC
- 2022

The main tools are an efficient partial clustering algorithm that relies on the sum-of-squares method, and a novel tensor decomposition algorithm that allows errors in both Frobenius norm and low-rank terms.

### Robust linear regression: optimal rates in polynomial time

- Mathematics, Computer Science
- STOC
- 2021

The central technical contribution is to algorithmically exploit independence of random variables in the "sum-of-squares" framework by formulating it as the aforementioned polynomial inequality.

### A super-polynomial lower bound for learning nonparametric mixtures

- Mathematics, Computer Science
- ArXiv
- 2022

A super-polynomial lower bound on the sample complexity of learning the component distributions in such models is established, and has important implications for the hardness of learning more general nonparametric latent variable models that arise in machine learning applications.

### Private Robust Estimation by Stabilizing Convex Relaxations

- Mathematics, Computer Science
- COLT
- 2022

This work gives the first polynomial time and sample private robust estimation algorithm to estimate the mean, covariance and higher moments in the presence of a constant fraction of adversarial outliers and is the first efficient algorithm (even in the absence of outliers) that succeeds without any condition-number assumptions.

### Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination

- Computer Science
- 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS)
- 2022

This work revisits two classic high-dimensional online learning problems, namely linear regression and contextual bandits, from the perspective of adversarial robustness, based on a novel alternating minimization scheme that interleaves ordinary least-squares with a simple convex program that finds the optimal reweighting of the distribution under a spectral constraint.

### Learning GMMs with Nearly Optimal Robustness Guarantees

- Computer Science
- COLT
- 2022

In this work we solve the problem of robustly learning a high-dimensional Gaussian mixture model with k components from $\epsilon$-corrupted samples up to accuracy $\tilde{O}(\epsilon)$ in total variation distance for any…

## References

### High-Dimensional Robust Mean Estimation in Nearly-Linear Time

- Computer Science, Mathematics
- SODA
- 2019

This work gives the first nearly-linear time algorithms for high-dimensional robust mean estimation on distributions with known covariance and sub-gaussian tails and unknown bounded covariance, and exploits the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time.

### Robustly Learning a Gaussian: Getting Optimal Error, Efficiently

- Computer Science, Mathematics
- SODA
- 2018

This work gives robust estimators that achieve estimation error $O(\varepsilon)$ in the total variation distance, which is optimal up to a universal constant that is independent of the dimension.

### Outlier-Robust Clustering of Non-Spherical Mixtures

- Computer Science, Mathematics
- ArXiv
- 2020

The techniques expand the sum-of-squares toolkit to show robust certifiability of TV-separated Gaussian clusters in data and extend to clustering mixtures of arbitrary affine transforms of the uniform distribution on the d-dimensional unit sphere.

### Robust moment estimation and improved clustering via sum of squares

- Computer Science, Mathematics
- STOC
- 2018

Improved algorithms for independent component analysis and learning mixtures of Gaussians in the presence of outliers are developed and a sharp upper bound on the sum-of-squares norms for moment tensors of any distribution that satisfies the Poincare inequality is shown.

### Robust Estimators in High Dimensions without the Computational Intractability

- Computer Science
- 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)
- 2016

This work obtains the first computationally efficient algorithms for agnostically learning several fundamental classes of high-dimensional distributions: a single Gaussian, a product distribution on the hypercube, mixtures of two product distributions (under a natural balancedness condition), and k Gaussians with identical spherical covariances.

### Faster and Sample Near-Optimal Algorithms for Proper Learning Mixtures of Gaussians

- Computer Science, Mathematics
- COLT
- 2014

An improved and generalized algorithm for selecting a good candidate distribution from among competing hypotheses, which improves previous such results from a quadratic dependence of the running time on $N$ to quasilinear.

### Efficient Algorithms and Lower Bounds for Robust Linear Regression

- Computer Science, Mathematics
- SODA
- 2019

Any polynomial time SQ learning algorithm for robust linear regression (in Huber's contamination model) with estimation complexity, must incur an error of $\Omega(\sqrt{\epsilon} \sigma)$.

### Settling the Polynomial Learnability of Mixtures of Gaussians

- Computer Science
- 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
- 2010

This paper gives the first polynomial time algorithm for proper density estimation for mixtures of k Gaussians that needs no assumptions on the mixture, and proves that such a dependence is necessary.

### Learning geometric concepts with nasty noise

- Computer Science
- STOC
- 2018

The first polynomial-time PAC learning algorithms for low-degree PTFs and intersections of halfspaces with dimension-independent error guarantees in the presence of nasty noise under the Gaussian distribution are given.

### Efficiently learning mixtures of two Gaussians

- Computer Science, Mathematics
- STOC '10
- 2010

This work provides a polynomial-time algorithm for this problem for the case of two Gaussians in $n$ dimensions (even if they overlap), with provably minimal assumptions on the Gaussians and polynomial data requirements, and efficiently performs near-optimal clustering.