# Sparse PCA: Algorithms, Adversarial Perturbations and Certificates

@inproceedings{dOrsi2020SparsePA,
  title={Sparse PCA: Algorithms, Adversarial Perturbations and Certificates},
  author={Tommaso d'Orsi and Pravesh Kothari and Gleb Novikov and David Steurer},
  booktitle={2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS)},
  year={2020},
  pages={553--564}
}
• Published 1 November 2020
We study efficient algorithms for Sparse PCA in standard statistical models (spiked covariance in its Wishart form). Our goal is to achieve optimal recovery guarantees while being resilient to small perturbations. Despite a long history of prior works, including explicit studies of perturbation resilience, the best known algorithmic guarantees for Sparse PCA are fragile and break down under small adversarial perturbations. We observe a basic connection between perturbation resilience and…
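To make the setting concrete, the spiked covariance model in its Wishart form and a classical recovery baseline can be sketched as below. This is a minimal illustrative sketch of diagonal thresholding from the broader Sparse PCA literature, not the paper's certificate-based algorithm; all problem sizes (`n`, `d`, `k`, `beta`) are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: n samples in dimension d, a k-sparse spike of strength beta.
n, d, k, beta = 2000, 300, 10, 4.0

# Planted k-sparse unit vector v.
v = np.zeros(d)
support = rng.choice(d, size=k, replace=False)
v[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)

# Spiked Wishart samples: y_i ~ N(0, I + beta * v v^T).
Y = rng.standard_normal((n, d)) + np.sqrt(beta) * rng.standard_normal((n, 1)) * v

# Diagonal thresholding: keep the k coordinates with the largest sample variance,
# then take the top eigenvector of the sample covariance restricted to them.
var = (Y ** 2).mean(axis=0)
S = np.sort(np.argsort(var)[-k:])
cov_S = (Y[:, S].T @ Y[:, S]) / n
w, U = np.linalg.eigh(cov_S)
v_hat = np.zeros(d)
v_hat[S] = U[:, -1]

# Overlap with the planted direction (close to 1 on success).
corr = abs(v @ v_hat)
print(f"|<v, v_hat>| = {corr:.3f}")
```

Baselines of this kind succeed in the unperturbed model but, as the paper argues, can break down under small adversarial perturbations of the samples, which motivates the certificate-based approach.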
## Figures and Tables from this paper

## 13 Citations

• Computer Science · ArXiv · 2023: We introduce general tools for designing efficient private estimation algorithms in high-dimensional settings whose statistical guarantees almost match those of the best known non-private algorithms.
• Computer Science · ArXiv · 2022: This work introduces a family of algorithms that, under mild assumptions, recover the signal X∗ in all estimation problems for which there exists a sum-of-squares algorithm that succeeds in recovering the signal when the noise N is Gaussian.
• Computer Science, Mathematics · ICML · 2022: An algorithm for robust regression that achieves arbitrarily accurate additive error in runtime closely matching the lower bound from the fine-grained hardness result, as well as an algorithm for sparse regression with similar runtime that is inspired by the 3SUM problem.
• Computer Science, Mathematics · COLT · 2022: It is shown that it is possible to efficiently certify whether a given n-by-d Gaussian matrix is well-spread if the number of observations is quadratic in the ambient dimension.
• Computer Science, Mathematics · SODA · 2022: This setting provides an example where refutation is harder than search in the natural planted model, and provides evidence for an algorithmic threshold for the problem at m ≳ Õ(n) · n^((1−δ)(D−1)) for 2^(n^δ)-time algorithms for all δ.
• Mathematics, Computer Science · ArXiv · 2021: This work gives an improved analysis of a slight variant of the spectral method proposed by Hopkins, Schramm, Shi, and Steurer (2016), showing that it approximately recovers v with high probability in the regime nρ ≪ √N.
• Computer Science · ArXiv · 2020: We present three provably accurate, polynomial-time approximation algorithms for the Sparse Principal Component Analysis (SPCA) problem, without imposing any restrictive assumptions on the input covariance matrix.
• Mathematics, Computer Science · ArXiv · 2020: This paper constructs general machinery for proving Sum-of-Squares lower bounds on certification problems by generalizing the techniques used by Barak et al. [FOCS 2016] to prove Sum-of-Squares lower bounds.
• Mathematics, Computer Science · Computational Complexity Conference · 2021: It is proved that with high probability over the choice of a random graph G from the Erdős–Rényi distribution G(n, 1/2), a natural sum-of-squares semidefinite program cannot refute the existence of a valid k-coloring of G for k = n^(1/2+ε).
• Computer Science · bioRxiv · 2022: ThreSPCA is a provably accurate algorithm based on thresholding the Singular Value Decomposition for the SPCA problem, without imposing any restrictive assumptions on the input covariance matrix, and it performs well in practice.
