# Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods

@article{Tan2017PolynomialTA, title={Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods}, author={Yan Shuo Tan and Roman Vershynin}, journal={ArXiv}, year={2017}, volume={abs/1704.01041} }

The problem of Non-Gaussian Component Analysis (NGCA) is about finding a maximal low-dimensional subspace $E$ in $\mathbb{R}^n$ so that data points projected onto $E$ follow a non-gaussian distribution. Although this is an appropriate model for some real world data analysis problems, there has been little progress on this problem over the last decade.
In this paper, we attempt to address this state of affairs in two ways. First, we give a new characterization of standard gaussian distributions…
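The NGCA setting described above can be illustrated with a minimal sketch. This is not the paper's algorithm, only a simple fourth-moment heuristic under stated assumptions: the data are isotropic, exactly one hidden direction is non-Gaussian (uniform, chosen here for illustration), and the function names `sample_ngca` and `fourth_moment_direction` are hypothetical. It relies on the fact that $\mathbb{E}[\|x\|^2 x x^\top] = (n+2) I$ for a standard gaussian $x$ in $\mathbb{R}^n$, so deviations from that matrix point toward non-gaussian directions.

```python
import numpy as np

def sample_ngca(num_samples, dim, direction, rng):
    """Draw points that are standard Gaussian except along `direction`,
    where the coordinate is uniform on [-sqrt(3), sqrt(3)] (unit variance)."""
    direction = direction / np.linalg.norm(direction)
    gauss = rng.standard_normal((num_samples, dim))
    # Remove the Gaussian component along `direction` ...
    gauss -= np.outer(gauss @ direction, direction)
    # ... and replace it with a non-Gaussian (uniform) coordinate.
    coord = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=num_samples)
    return gauss + np.outer(coord, direction)

def fourth_moment_direction(points):
    """Estimate a non-Gaussian direction from the empirical version of
    E[|x|^2 x x^T] - (n + 2) I, which vanishes for a standard Gaussian."""
    num_samples, dim = points.shape
    sq_norms = np.sum(points ** 2, axis=1)
    moment = (points * sq_norms[:, None]).T @ points / num_samples
    deviation = moment - (dim + 2) * np.eye(dim)
    eigvals, eigvecs = np.linalg.eigh(deviation)
    # The eigenvalue farthest from zero marks the non-Gaussian direction.
    return eigvecs[:, np.argmax(np.abs(eigvals))]

rng = np.random.default_rng(0)
dim = 10
true_dir = np.zeros(dim)
true_dir[3] = 1.0  # hidden non-Gaussian direction (assumed for the demo)
points = sample_ngca(200_000, dim, true_dir, rng)
estimate = fourth_moment_direction(points)
# Absolute inner product of unit vectors; values near 1 indicate recovery.
alignment = abs(estimate @ true_dir)
print(f"alignment with true direction: {alignment:.3f}")
```

The heuristic only detects directions whose fourth moment differs from the gaussian value of 3, so a non-gaussian distribution that matches the gaussian fourth moment would go unnoticed by this sketch.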

## 7 Citations

### Non-Gaussian Component Analysis via Lattice Basis Reduction

- Computer Science
- COLT
- 2022

A sample-efficient and computationally efficient algorithm is obtained for NGCA in the regime where A is discrete or nearly discrete, in a well-defined technical sense.

### Non-Gaussian component analysis using entropy methods

- Computer Science, Mathematics
- STOC
- 2019

An algorithm is given that takes polynomial time in the dimension n and has an inverse polynomial dependence on the error parameter measuring the angle distance between the non-Gaussian subspace and the subspace output by the algorithm.

### Non-Gaussian Component Analysis using Entropy Methods

- Computer Science, Mathematics
- 2018

An algorithm is given that takes polynomial time in the dimension n and has an inverse polynomial dependence on the error parameter measuring the angle distance between the non-Gaussian subspace and the subspace output by the algorithm.

### Optimal Spectral Recovery of a Planted Vector in a Subspace

- Mathematics, Computer Science
- ArXiv
- 2021

This work gives an improved analysis of a slight variant of the spectral method proposed by Hopkins, Schramm, Shi, and Steurer (2016), showing that it approximately recovers $v$ with high probability in the regime $n\rho \ll \sqrt{N}$.

### Some Algorithms and Paradigms for Big Data

- Computer Science
- 2018

This dissertation proves that the standard SDP relaxation of sparse PCA yields an algorithm that does signal recovery for sparse, model-misspecified phase retrieval with a sample complexity that scales according to the square of the sparsity parameter.

### Lattice-Based Methods Surpass Sum-of-Squares in Clustering

- Computer Science
- COLT
- 2022

In this work we show that for an important case of the canonical clustering task of a $d$-dimensional Gaussian mixture with unknown (and possibly degenerate) covariance, a lattice-based…

### Optimal SQ Lower Bounds for Robustly Learning Discrete Product Distributions and Ising Models

- Computer Science
- COLT
- 2022

The optimal Statistical Query lower bounds for robustly learning certain families of discrete high-dimensional distributions are established, and a generic SQ lower bound is developed starting from low-dimensional moment matching constructions for discrete univariate distributions.

## References

### A new algorithm of non-Gaussian component analysis with radial kernel functions

- Computer Science
- 2007

An alternative algorithm called iterative metric adaptation for radial kernel functions (IMAK) is developed, which is theoretically better justified within the NGCA framework and, in numerical examples, tends to outperform existing methods.

### In Search of Non-Gaussian Components of a High-Dimensional Distribution

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2006

This article proposes a new linear method to identify the "non-Gaussian subspace" within a very general semi-parametric framework, based on a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector belonging to the low-dimensional non-Gaussian target subspace, up to an estimation error.

### Joint low-rank approximation for extracting non-Gaussian subspaces

- Computer Science
- Signal Process.
- 2007

### Estimating Non-Gaussian Subspaces by Characteristic Functions

- Computer Science
- ICA
- 2006

This article considers high-dimensional data that contain a low-dimensional non-Gaussian structure contaminated with Gaussian noise, and proposes a new method using the Hessian of characteristic functions, which was applied to (multidimensional) independent component analysis.

### Fourier PCA and robust tensor decomposition

- Computer Science, Mathematics
- STOC
- 2014

The main application is the first provably polynomial-time algorithm for underdetermined ICA, i.e., learning an n × m matrix A from observations y = Ax where x is drawn from an unknown product distribution with arbitrary non-Gaussian components.

### Structure from Local Optima: Learning Subspace Juntas via Higher Order PCA

- Computer Science
- ArXiv
- 2011

A generalization of the well-known problem of learning k-juntas in $\mathbb{R}^n$, and a novel tensor algorithm for unraveling the structure of high-dimensional distributions, which substantially generalizes existing results on learning low-dimensional concepts.

### Sparse Non-Gaussian Component Analysis

- Computer Science
- IEEE Transactions on Information Theory
- 2010

A new approach to NGCA, called sparse NGCA, is proposed; it replaces the PCA-based procedure with a new algorithm the authors refer to as convex projection.

### Fast and robust fixed-point algorithms for independent component analysis

- Computer Science, Mathematics
- IEEE Trans. Neural Networks
- 1999

Using maximum entropy approximations of differential entropy, a family of new contrast (objective) functions for ICA is introduced; these enable both the estimation of the whole decomposition by minimizing mutual information, and estimation of individual independent components as projection pursuit directions.

### Isotropic PCA and Affine-Invariant Clustering

- Computer Science, Mathematics
- 2008 49th Annual IEEE Symposium on Foundations of Computer Science
- 2008

An extension of principal component analysis (PCA) and a new algorithm for clustering points in $\mathbb{R}^n$ based on it that is affine-invariant and nearly the best possible is presented, improving known results substantially.

### Introduction to the non-asymptotic analysis of random matrices

- Computer Science
- Compressed Sensing
- 2012

This is a tutorial on some basic non-asymptotic methods and concepts in random matrix theory, particularly for the problem of estimating covariance matrices in statistics and for validating probabilistic constructions of measurement matrices in compressed sensing.