Robust Sparse Mean Estimation via Sum of Squares

@inproceedings{Diakonikolas2022RobustSM,
  title={Robust Sparse Mean Estimation via Sum of Squares},
  author={Ilias Diakonikolas and Daniel M. Kane and Sushrut Karmalkar and Ankit Pensia and Thanasis Pittas},
  booktitle={Annual Conference on Computational Learning Theory},
  year={2022}
}
We study the problem of high-dimensional sparse mean estimation in the presence of an ε-fraction of adversarial outliers. Prior work obtained sample- and computationally-efficient algorithms for this task for identity-covariance subgaussian distributions. In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance. For distributions on R^d with “certifiably bounded” t-th moments and sufficiently light tails, our algorithm…
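To make the problem setup concrete, here is a minimal toy sketch of the estimation task the abstract describes: an ε-fraction of samples is adversarially corrupted, the empirical mean breaks down, and a naive robust baseline (coordinate-wise median truncated to the k largest entries) does much better. This is only an illustration of the corruption model, not the paper's sum-of-squares algorithm; all parameter values and the outlier placement are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example parameters: dimension, sparsity, samples, corruption fraction.
d, k, n, eps = 100, 5, 2000, 0.1
mu = np.zeros(d)
mu[:k] = 1.0  # k-sparse true mean

# Inliers: identity-covariance Gaussian around mu; an eps-fraction of rows
# is replaced by gross outliers (a crude stand-in for an adversary).
X = rng.standard_normal((n, d)) + mu
X[: int(eps * n)] = 10.0

# The empirical mean is dragged far from mu by the outliers ...
emp = X.mean(axis=0)

# ... while the coordinate-wise median is a simple robust baseline.
med = np.median(X, axis=0)

# Exploit sparsity: keep only the k coordinates of largest magnitude.
sparse_med = np.zeros(d)
top = np.argsort(np.abs(med))[-k:]
sparse_med[top] = med[top]

print("empirical mean error:", np.linalg.norm(emp - mu))
print("sparse median error: ", np.linalg.norm(sparse_med - mu))
```

The coordinate-wise median achieves only an O(ε)-suboptimal error in each coordinate and does not match the guarantees discussed in the paper; it simply shows why robustness and sparsity must be handled together.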
3 Citations

List-Decodable Sparse Mean Estimation via Difference-of-Pairs Filtering

A novel, conceptually simpler technique for list-decodable mean estimation with optimal error guarantee, both for distributions with “certifiably bounded” t-th moments in k-sparse directions and sufficiently light tails, and for Gaussian inliers.

Estimation Contracts for Outlier-Robust Geometric Perception

Conditions on the input measurements under which modern estimation algorithms are guaranteed to recover an estimate close to the ground truth in the presence of outliers are provided; the authors call such conditions an “estimation contract”.

Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

This work gives the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions and achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension.

References

Showing 1-10 of 64 references

High-Dimensional Robust Mean Estimation in Nearly-Linear Time

This work gives the first nearly-linear-time algorithms for high-dimensional robust mean estimation, both for distributions with known covariance and sub-gaussian tails and for those with unknown bounded covariance, and exploits the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time.

Outlier-robust moment-estimation via sum-of-squares

Improved algorithms for estimating low-degree moments of unknown distributions in the presence of adversarial outliers are developed and the guarantees of these algorithms match information-theoretic lower-bounds for the class of distributions the authors consider.

Efficient Algorithms for Outlier-Robust Regression

This work gives the first polynomial-time algorithm for performing linear or polynomial regression resilient to adversarial corruptions in both examples and labels, and gives a simple statistical lower bound showing that some distributional assumption is necessary to succeed in this setting.

Mixture models, robustness, and sum of squares proofs

We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and for robust mean estimation, both in high dimensions, that substantially improve…

Robust linear regression: optimal rates in polynomial time

The central technical contribution is to algorithmically exploit independence of random variables in the “sum-of-squares” framework by formulating it as the aforementioned polynomial inequality.

Robust Estimators in High Dimensions without the Computational Intractability

This work obtains the first computationally efficient algorithms for agnostically learning several fundamental classes of high-dimensional distributions: a single Gaussian, a product distribution on the hypercube, mixtures of two product distributions (under a natural balancedness condition), and k Gaussians with identical spherical covariances.

Statistical Query Lower Bounds for Robust Estimation of High-Dimensional Gaussians and Gaussian Mixtures

A general technique is described that yields the first Statistical Query lower bounds for a range of fundamental high-dimensional learning problems involving Gaussian distributions; it implies that the computational complexity of learning GMMs is inherently exponential in the dimension of the latent space, even though there is no such information-theoretic barrier.

Agnostic Estimation of Mean and Covariance

This work presents polynomial-time algorithms to estimate the mean and covariance of a distribution from i.i.d. samples in the presence of a fraction of malicious noise with error guarantees in terms of information-theoretic lower bounds.

List-decodable robust mean estimation and learning mixtures of spherical gaussians

The problem of list-decodable (robust) Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians are studied and a set of techniques that yield new efficient algorithms with significantly improved guarantees are developed.

Computationally Efficient Robust Sparse Estimation in High Dimensions

The theory identifies a unified set of deterministic conditions under which the algorithm guarantees accurate recovery of sparse functionals, and provides a novel algorithm, based on the same intuition, that is able to take advantage of further structure of the problem to achieve nearly optimal rates.
...