# A Sparse SVD Method for High-dimensional Data

```bibtex
@article{Yang2011ASS,
  title   = {A Sparse SVD Method for High-dimensional Data},
  author  = {Dan Yang and Zongming Ma and Andreas Buja},
  journal = {arXiv: Methodology},
  year    = {2011}
}
```

We present a new computational approach to approximating a large, noisy data table by a low-rank matrix with sparse singular vectors. The approximation is obtained from thresholded subspace iterations that produce the singular vectors simultaneously, rather than successively as in competing proposals. We introduce novel ways to estimate thresholding parameters which obviate the need for computationally expensive cross-validation. We also introduce a way to sparsely initialize the algorithm for…
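The thresholded subspace iteration described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' exact algorithm: the names `sparse_svd` and `soft_threshold` are hypothetical, entrywise soft-thresholding at a fixed level is assumed, and the paper's data-driven threshold estimation and sparse initialization are omitted.

```python
import numpy as np

def soft_threshold(A, t):
    """Entrywise soft-thresholding: shrink toward zero by t."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def sparse_svd(X, k, threshold, n_iter=50, seed=0):
    """Sketch of thresholded subspace iteration for a rank-k sparse SVD.

    Alternates power-iteration updates of the left and right subspaces
    with soft-thresholding, so all k singular vector estimates are
    refined simultaneously rather than one at a time.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # random orthonormal start for the right subspace (the paper uses
    # a sparse initialization instead)
    V, _ = np.linalg.qr(rng.standard_normal((p, k)))
    for _ in range(n_iter):
        # left update: multiply, threshold small entries, re-orthonormalize
        U, _ = np.linalg.qr(soft_threshold(X @ V, threshold))
        # right update, symmetric to the left one
        V, _ = np.linalg.qr(soft_threshold(X.T @ U, threshold))
    # singular value estimates; fix QR sign indeterminacy so they are >= 0
    d = np.diag(U.T @ X @ V)
    signs = np.where(d < 0, -1.0, 1.0)
    return U * signs, np.abs(d), V
```

Each pass multiplies by the data matrix, zeroes small coordinates, and re-orthonormalizes, which is what lets the method produce the sparse singular vectors simultaneously instead of successively.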

## 20 Citations

### A Sparse Singular Value Decomposition Method for High-Dimensional Data

- Computer Science
- 2014

A new computational approach to approximating a large, noisy data table by a low-rank matrix with sparse singular vectors is presented, obtained from thresholded subspace iterations that produce the singular vectors simultaneously, rather than successively as in competing proposals.

### A Simple and Provable Algorithm for Sparse Diagonal CCA

- Computer Science, ICML
- 2016

This work proposes a novel combinatorial algorithm for sparse diagonal CCA, i.e., sparse CCA under the additional assumption that variables within each set are standardized and uncorrelated, and can be straightforwardly adapted to other constrained variants of CCA enforcing structure beyond sparsity.

### Optimal Structured Principal Subspace Estimation: Metric Entropy and Minimax Rates

- Computer Science, J. Mach. Learn. Res.
- 2021

Applying the general results to the specific settings yields the minimax rates of convergence for those problems, including the previously unknown optimal rates for non-negative PCA/SVD, sparse SVD, and subspace-constrained PCA/SVD.

### Rate Optimal Denoising of Simultaneously Sparse and Low Rank Matrices

- Computer Science, J. Mach. Learn. Res.
- 2016

It is shown that an iterative thresholding algorithm achieves (near) optimal rates adaptively under mild conditions for a large class of loss functions.

### Sparse and Functional Principal Components Analysis

- Computer Science, 2019 IEEE Data Science Workshop (DSW)
- 2019

This work proposes a unified approach to regularized PCA which can induce both sparsity and smoothness in both the row and column principal components, and generalizes much of the previous literature.

### Analysis of sparse PCA using high dimensional data

- Computer Science, 2016 IEEE 12th International Colloquium on Signal Processing & Its Applications (CSPA)
- 2016

Results showed that both PCA and sparse PCA are suitable feature-extraction techniques for high-dimensional data, since classification accuracy was higher than when the original data were used as classifier inputs.

### Sparse CCA via Precision Adjusted Iterative Thresholding

- Computer Science
- 2013

An elementary necessary and sufficient characterization of when the CCA solution is sparse is introduced; a computationally efficient procedure, called CAPIT, is proposed to estimate the canonical directions, and the procedure is shown to be rate-optimal under various assumptions on nuisance parameters.

### Minimax estimation in sparse canonical correlation analysis

- Mathematics
- 2015

Canonical correlation analysis is a widely used multivariate statistical technique for exploring the relation between two sets of variables. This paper considers the problem of estimating the leading…

### Optimal denoising of simultaneously sparse and low rank matrices in high dimensions

- Computer Science, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton)
- 2013

An algorithm is shown to achieve (near) optimal rates of convergence adaptively under mild conditions and minimax rates for denoising simultaneously sparse and low rank matrices in high dimensions are studied.

### Supervised Sparse and Functional Principal Component Analysis

- Computer Science
- 2016

This article proposes a supervised sparse and functional principal component (SupSFPC) framework that can incorporate supervision information to recover underlying structures that are more interpretable and develops an efficient modified expectation-maximization (EM) algorithm for parameter estimation.

## References

Showing 1–10 of 35 references

### Augmented sparse principal component analysis for high dimensional data

- Computer Science
- 2012

This work studies the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations and proposes an estimator based on a coordinate selection scheme combined with PCA that achieves the optimal rate of convergence under a sparsity regime.

### Sparse Principal Component Analysis and Iterative Thresholding

- Computer Science
- 2013

Under a spiked covariance model, a new iterative thresholding approach is proposed for estimating principal subspaces in the setting where the leading eigenvectors are sparse; the approach is found to recover the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings.

### Convex Sparse Matrix Factorizations

- Computer Science, ArXiv
- 2008

This work presents a convex formulation of dictionary learning for sparse signal decomposition that introduces an explicit trade-off between size and sparsity of the decomposition of rectangular matrices and compares the estimation abilities of the convex and nonconvex approaches.

### Online Learning for Matrix Factorization and Sparse Coding

- Computer Science, J. Mach. Learn. Res.
- 2010

A new online optimization algorithm is proposed, based on stochastic approximations, which scales up gracefully to large data sets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems.

### A Generalized Least Squares Matrix Decomposition

- Computer Science
- 2011

The Generalized least squares Matrix Decomposition (GMD), a generalization of the singular value decomposition (SVD) and principal components analysis (PCA) that is appropriate for massive data sets with structured variables or known two-way dependencies, is proposed.

### Reconstruction of a low-rank matrix in the presence of Gaussian noise

- Mathematics, J. Multivar. Anal.
- 2013

### Sparse principal component analysis via regularized low rank matrix approximation

- Computer Science
- 2008

### On Consistency and Sparsity for Principal Components Analysis in High Dimensions

- Computer Science, Mathematics, Journal of the American Statistical Association
- 2009

A simple algorithm for selecting a subset of coordinates with largest sample variances is provided, and it is shown that if PCA is done on the selected subset, then consistency is recovered, even if p(n) ≫ n.

### Model Averaging and Dimension Selection for the Singular Value Decomposition

- Mathematics
- 2006

Many multivariate data-analysis techniques for an m × n matrix Y are related to the model Y = M + E, where Y is an m × n matrix of full rank and M is an unobserved mean matrix of rank K < (m ∧ n).…

### A Framework for Feature Selection in Clustering

- Computer Science, Journal of the American Statistical Association
- 2010

A novel framework for sparse clustering is proposed, in which one clusters the observations using an adaptively chosen subset of the features, which uses a lasso-type penalty to select the features.