• Corpus ID: 244920662

Lattice-Based Methods Surpass Sum-of-Squares in Clustering

Ilias Zadik, Min Jae Song, Alexander S. Wein, Joan Bruna
Clustering is a fundamental primitive in unsupervised learning which gives rise to a rich class of computationally challenging inference tasks. In this work, we focus on the canonical task of clustering d-dimensional Gaussian mixtures with unknown (and possibly degenerate) covariance. Recent works (Ghosh et al. ’20; Mao, Wein ’21; Davis, Diaz, Wang ’21) have established lower bounds against the class of low-degree polynomial methods and the sum-of-squares (SoS) hierarchy for recovering certain…
3 Citations

Non-Gaussian Component Analysis via Lattice Basis Reduction
A sample-efficient and computationally efficient algorithm for NGCA is obtained in the regime where A is discrete or nearly discrete, in a well-defined technical sense.
Hardness of Noise-Free Learning for Two-Hidden-Layer Neural Networks
Superpolynomial statistical query lower bounds are given for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model, and a lifting procedure due to Daniely and Vardi is shown to reduce Boolean PAC learning problems to Gaussian ones.
The Franz-Parisi Criterion and Computational Trade-offs in High Dimensional Statistics
Many high-dimensional statistical inference problems are believed to possess inherent computational hardness. Various frameworks have been proposed to give rigorous evidence for such hardness,


Bayesian estimation from few samples: community detection and related problems
An efficient meta-algorithm for Bayesian estimation problems that is based on low-degree polynomials, semidefinite programming, and tensor decomposition, inspired by recent lower bound constructions for sum-of-squares and related to the method of moments is proposed.
The Power of Sum-of-Squares for Detecting Hidden Structures
It is proved that for a wide class of planted problems, including refuting random constraint satisfaction problems, tensor and sparse PCA, densest-k-subgraph, community detection in stochastic block models, planted clique, and others, eigenvalues of degree-d matrix polynomials are as powerful as SoS semidefinite programs of degree d.
Robust Discriminative Clustering with Sparse Regularizers
This paper shows that this joint clustering and dimension reduction formulation is equivalent to previously proposed discriminative clustering frameworks, thus leading to convex relaxations of the problem and proposes a novel sparse extension that allows estimation in higher dimensions.
Reducibility and Computational Lower Bounds for Problems with Planted Sparse Structure
This work introduces several new techniques to give a web of average-case reductions showing strong computational lower bounds based on the planted clique conjecture using natural problems as intermediates, including tight lower bounds for Planted Independent Set, Planted Dense Subgraph, Sparse Spiked Wigner, and Sparse PCA.
Statistical Query Lower Bounds for Robust Estimation of High-Dimensional Gaussians and Gaussian Mixtures
A general technique that yields the first Statistical Query lower bounds for a range of fundamental high-dimensional learning problems involving Gaussian distributions is described, which implies that the computational complexity of learning GMMs is inherently exponential in the dimension of the latent space even though there is no such information-theoretic barrier.
Fast spectral algorithms from sum-of-squares proofs: tensor decomposition and planted sparse vectors
This work gives an algorithm with running time nearly linear in the input size that approximately recovers a planted sparse vector with up to constant relative sparsity in a random subspace of ℝ^n of dimension up to Ω(√n).
Settling the Polynomial Learnability of Mixtures of Gaussians
  • Ankur Moitra, G. Valiant
  • Computer Science
    2010 IEEE 51st Annual Symposium on Foundations of Computer Science
  • 2010
This paper gives the first polynomial time algorithm for proper density estimation for mixtures of k Gaussians that needs no assumptions on the mixture, and proves that such a dependence is necessary.
Non-Gaussian Component Analysis via Lattice Basis Reduction
A sample-efficient and computationally efficient algorithm for NGCA is obtained in the regime where A is discrete or nearly discrete, in a well-defined technical sense.
Statistical Algorithms and a Lower Bound for Detecting Planted Cliques
The main application is a nearly optimal lower bound on the complexity of any statistical query algorithm for detecting planted bipartite clique distributions when the planted clique has size O(n^(1/2 − δ)) for any constant δ > 0.
CHIME: Clustering of high-dimensional Gaussian mixtures with EM algorithm and its optimality
This paper studies clustering of high-dimensional Gaussian mixtures and proposes a procedure, called CHIME, that is based on the EM algorithm and a direct estimation method for the sparse discriminant vector that outperforms the existing methods under a variety of settings.