Phase transitions in sparse PCA

@article{Lesieur2015PhaseTI,
  title={Phase transitions in sparse PCA},
  author={Thibault Lesieur and Florent Krzakala and Lenka Zdeborov{\'a}},
  journal={2015 IEEE International Symposium on Information Theory (ISIT)},
  year={2015},
  pages={1635-1639}
}
We study optimal estimation for sparse principal component analysis when the number of non-zero elements is small but on the same order as the dimension of the data. We employ the approximate message passing (AMP) algorithm and its state evolution to analyze the information-theoretically minimal mean-squared error and the error achieved by AMP in the limit of large system sizes. For the special case of rank one and large enough density of non-zeros, Deshpande and Montanari [1] proved that AMP is…
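As a rough illustration of the kind of AMP iteration the abstract refers to, here is a minimal sketch for the rank-one spiked Wigner model with a dense ±1 prior, not the sparse prior studied in the paper; the signal-to-noise ratio `lam`, the system size, the weakly informative initialization, and the `tanh` denoiser (the posterior mean for a ±1 signal) are all choices made for this demo rather than details taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam = 2000, 2.0                       # system size and signal-to-noise ratio (demo values)

# Spiked Wigner model: Y = sqrt(lam/n) * x x^T + symmetric Gaussian noise
x = rng.choice([-1.0, 1.0], size=n)      # dense +-1 signal (not the sparse prior of the paper)
Z = rng.normal(size=(n, n))
Z = (Z + Z.T) / np.sqrt(2.0)             # symmetric noise, entries ~ N(0, 1)
Y = np.sqrt(lam / n) * np.outer(x, x) + Z
A = np.sqrt(lam / n) * Y                 # rescaled data matrix used by the iteration

# AMP with the posterior-mean denoiser for the +-1 prior (f = tanh),
# including the Onsager correction term involving the previous iterate.
m = 0.05 * x + 0.1 * rng.normal(size=n)  # weakly informative start, standing in for a spectral init
m_old = np.zeros(n)
b = 0.0
for _ in range(40):
    u = A @ m - b * m_old                # Onsager-corrected linear step
    m_old = m
    m = np.tanh(u)                       # denoising step
    b = (lam / n) * np.sum(1.0 - m**2)   # Onsager coefficient for the next step

overlap = abs(m @ x) / n                 # normalized overlap with the hidden signal
print(f"overlap = {overlap:.2f}")
```

Above the algorithmic threshold (here `lam > 1`) the overlap settles at a macroscopic value; state evolution tracks exactly this scalar quantity across iterations in the large-`n` limit.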
MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel
TLDR
The minimum mean squared error (MMSE) achievable information theoretically and with the AMP algorithm is characterized, and the corresponding approximate message passing (AMP) algorithm and its state evolution are derived.
Phase transitions in spiked matrix estimation: information-theoretic analysis
TLDR
The minimal mean squared error is computed for the estimation of the low-rank signal and it is compared to the performance of spectral estimators and message passing algorithms.
Information-Theoretic Bounds and Phase Transitions in Clustering, Sparse PCA, and Submatrix Localization
TLDR
The upper bounds show that for each of these problems there is a significant regime where reliable detection is information-theoretically possible but where known algorithms such as PCA fail completely, since the spectrum of the observed matrix is uninformative.
Rank-one matrix estimation: analysis of algorithmic and information theoretic limits by the spatial coupling method
TLDR
The spatial coupling methodology developed in the framework of error-correcting codes is used to rigorously derive the mutual information for the symmetric rank-one case, and it is shown that the computational gap vanishes for the proposed spatially coupled model, a promising feature with many possible applications.
Mutual information in rank-one matrix estimation
TLDR
It is proved that the Bethe mutual information always yields an upper bound to the exact mutual information, using an interpolation method proposed by Guerra and later refined by Korada and Macris, in the case of rank-one symmetric matrix estimation.
Statistical and computational phase transitions in spiked tensor estimation
TLDR
The performance of Approximate Message Passing is studied and it is shown that it achieves the MMSE for a large set of parameters, and that factorization is algorithmically “easy” in a much wider region than previously believed.
Fundamental limits of symmetric low-rank matrix estimation
TLDR
This paper considers the high-dimensional inference problem where the signal is a low-rank symmetric matrix corrupted by additive Gaussian noise, and computes, in the large-dimension limit, the mutual information between the signal and the observations while the rank of the signal remains constant.
Optimality and Sub-optimality of PCA for Spiked Random Matrices and Synchronization
TLDR
The fundamental limitations of statistical methods are studied, including non-spectral ones, and it is shown that inefficient procedures can work below the threshold where PCA succeeds, whereas no known efficient algorithm achieves this.
Constrained Low-rank Matrix Estimation: Phase Transitions, Approximate Message Passing and Applications
TLDR
The derivations of the TAP equations for models as different as the Sherrington–Kirkpatrick model, the restricted Boltzmann machine, the Hopfield model, and vector (XY, Heisenberg, and other) spin glasses are unified.

References

Showing 1–10 of 20 references.
Information-theoretically optimal sparse PCA
TLDR
This work analyzes an approximate message passing algorithm to estimate the underlying signal and shows, in the high-dimensional limit, that the AMP estimates are information-theoretically optimal, effectively providing a single-letter characterization of the sparse PCA problem.
Optimal Solutions for Sparse Principal Component Analysis
TLDR
A new semidefinite relaxation is formulated and a greedy algorithm is derived that computes a full set of good solutions for all target numbers of non-zero coefficients, with total complexity O(n³), where n is the number of variables.
Phase Transitions and Sample Complexity in Bayes-Optimal Matrix Factorization
TLDR
This work computes the minimal mean-squared error achievable, in principle, in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm, based on the asymptotic state-evolution analysis of the algorithm.
Sparse PCA via Covariance Thresholding
TLDR
A covariance thresholding algorithm recently proposed by Krauthgamer, Nadler and Vilenchik is analyzed, and it is rigorously proved that the algorithm succeeds with high probability for k of order √n.
Do Semidefinite Relaxations Really Solve Sparse PCA?
Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms were suggested for this sparse PCA problem, from…
High-dimensional analysis of semidefinite relaxations for sparse principal components
A. Amini and M. Wainwright, 2008 IEEE International Symposium on Information Theory, 2008.
TLDR
This paper analyzes a simple and computationally inexpensive diagonal cut-off method and establishes a threshold of the order θ_diag = n/[k² log(p−k)] separating success from failure, and proves that a more complex semidefinite programming (SDP) relaxation due to d'Aspremont et al. succeeds once the sample size is of the order θ_sdp.
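To make the gap between the two regimes concrete, this toy calculation compares the sample-size scales n ~ k² log(p − k) for diagonal thresholding and n ~ k log(p − k) for the SDP relaxation, with all constants omitted; the function names and the example values of p and k are invented for this sketch:

```python
import math

def n_diag(k, p):
    """Sample-size scale for diagonal thresholding: n ~ k^2 log(p - k)."""
    return k**2 * math.log(p - k)

def n_sdp(k, p):
    """Sample-size scale for the SDP relaxation: n ~ k log(p - k)."""
    return k * math.log(p - k)

p, k = 10_000, 100                       # hypothetical dimension and sparsity
print(n_diag(k, p) / n_sdp(k, p))        # the ratio is exactly k = 100
```

Up to constants, the SDP relaxation needs a factor of k fewer samples than the diagonal method, which is precisely the regime the paper's thresholds separate.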
On Consistency and Sparsity for Principal Components Analysis in High Dimensions
I. Johnstone and A. Lu, Journal of the American Statistical Association, 2009.
TLDR
A simple algorithm for selecting a subset of coordinates with largest sample variances is provided, and it is shown that if PCA is done on the selected subset, then consistency is recovered, even if p(n) ≫ n.
Do Semidefinite Relaxations Solve Sparse PCA up to the Information Limit?
Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms were developed for this sparse PCA problem,…
Iterative estimation of constrained rank-one matrices in noise
S. Rangan and A. Fletcher, 2012 IEEE International Symposium on Information Theory Proceedings, 2012.
TLDR
This work considers the problem of estimating a rank-one matrix in Gaussian noise under a probabilistic model for the left and right factors of the matrix, and proposes a simple iterative procedure that reduces the problem to a sequence of scalar estimation computations.
Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices
We compute the limiting distributions of the largest eigenvalue of a complex Gaussian sample covariance matrix when both the number of samples and the number of variables in each sample become large…
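The largest-eigenvalue phase transition this entry describes (the BBP transition) is easy to check numerically. The sketch below uses a real symmetric Wigner matrix with a rank-one additive spike rather than the paper's complex sample covariance ensemble, so in this normalization the bulk edge sits at 2 and the transition at spike strength β = 1; the sizes and β values are demo choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1500

def top_eigenvalue(beta):
    """Largest eigenvalue of beta * v v^T plus a Wigner matrix (bulk ~ [-2, 2])."""
    G = rng.normal(size=(n, n)) / np.sqrt(n)
    W = (G + G.T) / np.sqrt(2.0)         # symmetric noise, entry variance 1/n
    v = rng.normal(size=n)
    v /= np.linalg.norm(v)               # unit-norm spike direction
    return np.linalg.eigvalsh(W + beta * np.outer(v, v))[-1]

# Below the transition (beta <= 1) the top eigenvalue sticks to the bulk edge 2
# and is uninformative; above it, it detaches and converges to beta + 1/beta.
print(top_eigenvalue(0.5))               # close to 2.0
print(top_eigenvalue(2.0))               # close to 2.5
```

This is the spectral phenomenon behind the "uninformative spectrum" regimes mentioned in several entries above: below the threshold, no trace of the planted signal is visible in the eigenvalues.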