Corpus ID: 16073783

Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods

@article{Tan2017PolynomialTA,
  title={Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods},
  author={Yan Shuo Tan and Roman Vershynin},
  journal={ArXiv},
  year={2017},
  volume={abs/1704.01041}
}
The problem of Non-Gaussian Component Analysis (NGCA) is about finding a maximal low-dimensional subspace $E$ in $\mathbb{R}^n$ so that data points projected onto $E$ follow a non-Gaussian distribution. Although this is an appropriate model for some real-world data analysis problems, there has been little progress on this problem over the last decade. In this paper, we attempt to address this state of affairs in two ways. First, we give a new characterization of standard Gaussian distributions… 
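
For intuition, here is a minimal sketch of the NGCA model together with a simple fourth-moment spectral heuristic; this is an illustration we supply, not the paper's algorithm, and all parameters are our own choices. It uses only the fact that $\mathbb{E}[\|X\|^2 XX^\top] = (n+2)I$ for standard Gaussian $X$ in $\mathbb{R}^n$, so it finds the hidden direction only when that component's kurtosis differs from Gaussian.

```python
# Toy illustration of the NGCA model; a simple fourth-moment spectral
# heuristic, NOT the algorithm from the paper. It relies on the fact that
# E[||X||^2 X X^T] = (n + 2) I when X is standard Gaussian in R^n, so an
# eigenvalue of the reweighted covariance deviating from n + 2 flags a
# non-Gaussian direction (assuming its kurtosis differs from Gaussian's).
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 200_000                          # ambient dimension, sample size

# One hidden Rademacher (non-Gaussian) coordinate, Gaussian noise elsewhere.
X = rng.standard_normal((m, n))
X[:, 0] = rng.choice([-1.0, 1.0], size=m)

# Hide the structure with a random rotation; the planted direction is Q e_1.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
v_true = Q[:, 0]
Y = X @ Q.T

# Reweighted covariance M = (1/m) sum_i ||y_i||^2 y_i y_i^T.
weights = (Y ** 2).sum(axis=1, keepdims=True)
M = (Y * weights).T @ Y / m

# The eigenvalue farthest from n + 2 marks the non-Gaussian direction.
eigvals, eigvecs = np.linalg.eigh(M)
v_hat = eigvecs[:, np.argmax(np.abs(eigvals - (n + 2)))]

print(f"overlap |<v_hat, v_true>| = {abs(v_hat @ v_true):.3f}")  # close to 1
```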

Non-Gaussian Component Analysis via Lattice Basis Reduction

A sample- and computationally-efficient algorithm is obtained for NGCA in the regime where $A$ is discrete or nearly discrete, in a well-defined technical sense.

Non-Gaussian component analysis using entropy methods

An algorithm is given that runs in time polynomial in the dimension $n$ and has an inverse-polynomial dependence on the error parameter measuring the angular distance between the non-Gaussian subspace and the subspace output by the algorithm.

Optimal Spectral Recovery of a Planted Vector in a Subspace

This work gives an improved analysis of a slight variant of the spectral method proposed by Hopkins, Schramm, Shi, and Steurer (2016), showing that it approximately recovers $v$ with high probability in the regime $n\rho \ll \sqrt{N}$.
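
For concreteness, the following is a hedged sketch of the norm-weighted spectral idea underlying this line of work: rows of an orthonormal basis of the subspace that hit the support of the planted sparse vector have atypically large norms. The centering $n/N$ and all parameters are our own illustrative choices; the exact variant and centering analyzed in the paper may differ.

```python
# Sketch of a norm-weighted spectral method for recovering a planted sparse
# vector in a random subspace (in the spirit of Hopkins-Schramm-Shi-Steurer);
# the centering n/N and all parameters are our own illustrative choices.
import numpy as np

rng = np.random.default_rng(1)
N, n, rho = 4000, 20, 0.01                    # ambient dim, subspace dim, sparsity

# Planted rho*N-sparse unit vector v, hidden in a subspace spanned together
# with n-1 random Gaussian directions; B is an orthonormal basis of that span.
v = np.zeros(N)
support = rng.choice(N, size=int(rho * N), replace=False)
v[support] = rng.choice([-1.0, 1.0], size=support.size) / np.sqrt(support.size)
G = rng.standard_normal((N, n - 1))
B, _ = np.linalg.qr(np.column_stack([v, G]))  # span(B) contains v

# Rows u_i of B on the support of v have atypically large norms. Form
# M = sum_i (||u_i||^2 - n/N) u_i u_i^T and take its top eigenvector.
row_norms2 = (B ** 2).sum(axis=1)             # average value is n/N
M = (B * (row_norms2 - n / N)[:, None]).T @ B
eigvals, eigvecs = np.linalg.eigh(M)
w = eigvecs[:, -1]                            # eigenvector of largest eigenvalue
v_hat = B @ w                                 # lift back to R^N

print(f"overlap |<v_hat, v>| = {abs(v_hat @ v):.3f}")  # close to 1
```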

Some Algorithms and Paradigms for Big Data

This dissertation proves that the standard SDP relaxation of sparse PCA yields an algorithm that does signal recovery for sparse, model-misspecified phase retrieval with a sample complexity that scales according to the square of the sparsity parameter.

Lattice-Based Methods Surpass Sum-of-Squares in Clustering

In this work we show that for an important case of the canonical clustering task of a $d$-dimensional Gaussian mixture with unknown (and possibly degenerate) covariance, a lattice-based method surpasses sum-of-squares algorithms.

Optimal SQ Lower Bounds for Robustly Learning Discrete Product Distributions and Ising Models

The optimal Statistical Query lower bounds for robustly learning certain families of discrete high-dimensional distributions are established, and a generic SQ lower bound is developed starting from low-dimensional moment matching constructions for discrete univariate distributions.

References

Showing 1-10 of 26 references

A new algorithm of non-Gaussian component analysis with radial kernel functions

An alternative algorithm called iterative metric adaptation for radial kernel functions (IMAK) is developed, which is better justified theoretically within the NGCA framework and tends to outperform existing methods in numerical examples.

In Search of Non-Gaussian Components of a High-Dimensional Distribution

This article proposes a new linear method to identify the "non-Gaussian subspace" within a very general semi-parametric framework, based on a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector belonging to the low-dimensional non-Gaussian target subspace, up to an estimation error.
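
To make the operator concrete (this is our paraphrase of the construction, not a quotation from the article): by Stein's identity, $\mathbb{E}[X h(X)] = \mathbb{E}[\nabla h(X)]$ for standard Gaussian $X$ and smooth $h$, so the vector
$$\beta(h) \;=\; \mathbb{E}[X\, h(X)] \;-\; \mathbb{E}[\nabla h(X)]$$
vanishes on the Gaussian component and, under the NGCA model, lies in the non-Gaussian target subspace up to estimation error; sweeping over many test functions $h$ and applying PCA to the resulting vectors $\beta(h)$ then recovers the subspace.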

Joint low-rank approximation for extracting non-Gaussian subspaces

Estimating Non-Gaussian Subspaces by Characteristic Functions

This article considers high-dimensional data containing a low-dimensional non-Gaussian structure contaminated with Gaussian noise, and proposes a new method using the Hessian of characteristic functions, which is also applied to (multidimensional) independent component analysis.

Fourier PCA and robust tensor decomposition

The main application is the first provably polynomial-time algorithm for underdetermined ICA, i.e., learning an $n \times m$ matrix $A$ from observations $y = Ax$, where $x$ is drawn from an unknown product distribution with arbitrary non-Gaussian components.

Structure from Local Optima: Learning Subspace Juntas via Higher Order PCA

A generalization of the well-known problem of learning $k$-juntas in $\mathbb{R}^n$, and a novel tensor algorithm for unraveling the structure of high-dimensional distributions, which substantially generalizes existing results on learning low-dimensional concepts.

Sparse Non-Gaussian Component Analysis

A new approach to NGCA is proposed, called sparse NGCA, which replaces the PCA-based procedure with a new algorithm the authors refer to as convex projection.

Fast and robust fixed-point algorithms for independent component analysis

  • Aapo Hyvärinen
  • Computer Science, Mathematics
    IEEE Trans. Neural Networks
  • 1999
Using maximum entropy approximations of differential entropy, a family of new contrast (objective) functions for ICA enables both estimation of the whole decomposition by minimizing mutual information and estimation of individual independent components as projection pursuit directions.
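
Below is a minimal sketch of the one-unit fixed-point iteration from this paper, $w \leftarrow \mathbb{E}[z\,g(w^\top z)] - \mathbb{E}[g'(w^\top z)]\,w$ with the tanh contrast; the toy data and deflation details are our own illustrative choices, not taken from the paper.

```python
# Minimal sketch of the FastICA one-unit fixed-point iteration with the tanh
# contrast: w <- E[z g(w.z)] - E[g'(w.z)] w, then renormalize. The toy data
# and the deflation scheme are illustrative choices, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two independent non-Gaussian (uniform) sources, linearly mixed.
m = 20_000
S = rng.uniform(-1, 1, size=(2, m))
A = np.array([[1.0, 0.6], [0.4, 1.0]])
X = A @ S

# FastICA requires whitened data: zero mean, identity covariance.
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Z = E @ np.diag(d ** -0.5) @ E.T @ X

def fastica_unit(Z, w, W_prev, iters=200, tol=1e-10):
    """Estimate one component; orthogonalize against earlier ones (deflation)."""
    for _ in range(iters):
        wz = w @ Z
        g, g_prime = np.tanh(wz), 1.0 - np.tanh(wz) ** 2
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
        for u in W_prev:                      # Gram-Schmidt deflation step
            w_new -= (w_new @ u) * u
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1.0) < tol:   # converged up to sign
            return w_new
        w = w_new
    return w

W = []
for _ in range(2):
    w0 = rng.standard_normal(2)
    w0 /= np.linalg.norm(w0)
    W.append(fastica_unit(Z, w0, W))

S_hat = np.vstack(W) @ Z                      # estimated sources, up to sign/order
```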

Isotropic PCA and Affine-Invariant Clustering

  • S. Brubaker, S. Vempala
  • Computer Science, Mathematics
    2008 49th Annual IEEE Symposium on Foundations of Computer Science
  • 2008
An extension of principal component analysis (PCA) is presented, along with a new affine-invariant algorithm based on it for clustering points in $\mathbb{R}^n$ that is nearly the best possible, improving known results substantially.

Introduction to the non-asymptotic analysis of random matrices

This is a tutorial on some basic non-asymptotic methods and concepts in random matrix theory, particularly for the problem of estimating covariance matrices in statistics and for validating probabilistic constructions of measurement matrices in compressed sensing.
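
As a quick numerical illustration of the covariance estimation question surveyed there (parameters are our own): for $m$ i.i.d. isotropic Gaussian samples in $\mathbb{R}^n$, the sample covariance deviates from the identity by roughly $\sqrt{n/m}$ in operator norm.

```python
# Toy numerical check of a standard bound surveyed in the tutorial: for m
# i.i.d. samples of an isotropic (sub-)Gaussian vector in R^n, the sample
# covariance satisfies ||Sigma_hat - I|| = O(sqrt(n/m)) in operator norm.
# Parameters are our own illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n = 50
for m in (200, 800, 3200, 12800):
    X = rng.standard_normal((m, n))
    Sigma_hat = X.T @ X / m
    err = np.linalg.norm(Sigma_hat - np.eye(n), ord=2)
    print(f"m = {m:6d}   ||Sigma_hat - I|| = {err:.3f}   sqrt(n/m) = {np.sqrt(n / m):.3f}")
```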