Sparse Principal Component Analysis

@article{Zou2006SparsePC,
  title={Sparse Principal Component Analysis},
  author={Hui Zou and Trevor J. Hastie and Robert Tibshirani},
  journal={Journal of Computational and Graphical Statistics},
  year={2006},
  volume={15},
  pages={265--286}
}
Principal component analysis (PCA) is widely used in data processing and dimensionality reduction. However, PCA suffers from the fact that each principal component is a linear combination of all the original variables, which often makes the results difficult to interpret. We introduce a new method called sparse principal component analysis (SPCA) using the lasso (elastic net) to produce modified principal components with sparse loadings. We first show that PCA can be formulated as a regression…
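The regression formulation above is what lets lasso-type penalties produce sparse loadings. As a minimal illustration (not the SPCA elastic-net algorithm itself, but the naive simple-thresholding baseline such methods improve on), the sketch below computes ordinary PCA loadings via the SVD, zeroes out small entries, and renormalizes; the function name `thresholded_loadings` and the `threshold` parameter are illustrative assumptions.

```python
import numpy as np

def thresholded_loadings(X, n_components=2, threshold=0.2):
    """Naive sparse loadings by simple thresholding: compute ordinary
    PCA loadings, zero entries below `threshold` in absolute value,
    and renormalize each component to unit length.

    This is the simple-thresholding baseline, not the elastic-net
    algorithm of Zou, Hastie & Tibshirani (2006).
    """
    Xc = X - X.mean(axis=0)                   # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                   # p x k ordinary PC loadings
    V[np.abs(V) < threshold] = 0.0            # enforce sparsity
    norms = np.linalg.norm(V, axis=0)
    norms[norms == 0] = 1.0                   # leave all-zero components as zero
    return V / norms                          # unit-length sparse loadings
```

Interpretability improves because each component now loads on only a few variables, though, unlike SPCA, thresholding alone can misrepresent the variance actually explained.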
Hierarchically penalized sparse principal component analysis
TLDR
A new PCA method is proposed that incorporates group information into model fitting to improve variable selection performance when variables are grouped; it not only selects important groups but also removes unimportant variables within the identified groups.
Projection Sparse Principal Component Analysis: an efficient method for improving the interpretation of principal components
TLDR
This work proposes a practical SPCA method in which sparse components are computed by projecting the full principal components onto a subset of the variables and shows that these components explain more than a predetermined percentage of the variance explained by the principal components.
Stochastic convex sparse principal component analysis
TLDR
A convex sparse principal component analysis (Cvx-SPCA), which leverages a proximal variance reduced stochastic scheme to achieve a geometric convergence rate and it is shown that the convergence analysis can be significantly simplified by using a weak condition which allows a broader class of objectives to be applied.
Robust Sparse Principal Component Analysis
TLDR
The method is applied on several real data examples, and diagnostic plots for detecting outliers and for selecting the degree of sparsity are provided, and an algorithm to compute the sparse and robust principal components is proposed.
Sparse Principal Component Analysis Based on Least Trimmed Squares
TLDR
A robust sparse PCA method is proposed to handle potential outliers in the data, building on the least trimmed squares PCA method, which provides robust but non-sparse PC estimates; the computation time is reduced to a great extent.
Sparse Principal Component Analysis Incorporating Stability Selection
TLDR
This new approach is able to find sparse PCs that are linear combinations of subsets of variables selected with respect to Type I error control and will be compared with other sparse PCA approaches by a simulation study.
Sparse principal component regression with adaptive loading
Sparse Principal Component Analysis via Joint L2,1-Norm Penalty
TLDR
This work modifies the regression model by replacing the elastic net with the L2,1-norm, which encourages row-sparsity and removes the same features across different PCs, and utilizes this new "self-contained" regression model to present a new framework for graph embedding methods, which can obtain sparse loadings via the L2,1-norm.
Sparse Principal Component Analysis: a Least Squares approximation approach
TLDR
This work derives sparse solutions with large loadings by adding a genuine sparsity requirement to the original Principal Components Analysis objective function and proposes a Branch-and-Bound search and an iterative elimination algorithm to identify the best subset of non-zero loadings.

References

Showing 1–10 of 26 references
A Modified Principal Component Technique Based on the LASSO
In many multivariate statistical techniques, a set of linear functions of the original p variables is produced. One of the more difficult aspects of these techniques is the interpretation of the…
Simple principal components (S. Vines, 2000)
TLDR
An algorithm for producing simple approximate principal components directly from a variance–covariance matrix using a series of ‘simplicity preserving’ linear transformations that can always be represented by integers.
Principal Component Analysis
Introduction * Properties of Population Principal Components * Properties of Sample Principal Components * Interpreting Principal Components: Examples * Graphical Representation of Data Using…
Regression Shrinkage and Selection via the Elastic Net, with Applications to Microarrays
TLDR
The elastic net is proposed, a new regression shrinkage and selection method that can be used to construct a classification rule and perform automatic gene selection at the same time in microarray data, where the lasso is not very satisfactory.
Regression Shrinkage and Selection via the Lasso
TLDR
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
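The constrained formulation summarized above is equivalent, through the Lagrangian, to a penalized form that is convenient to minimize by cyclic coordinate descent with soft-thresholding. A minimal sketch of that penalized lasso follows (not Tibshirani's original algorithm; `lasso_cd` and its parameters are illustrative names):

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal map of the l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for the penalized lasso:
    minimize (1/2n) * ||y - X b||^2 + lam * sum_j |b_j|.
    Assumes X has no all-zero columns."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-coordinate curvature
    r = y.astype(float).copy()          # residual y - Xb (b starts at 0)
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * b[j]      # drop coordinate j's contribution
            rho = X[:, j] @ r / n       # correlation with partial residual
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r = r - X[:, j] * b[j]      # restore with updated b[j]
    return b
```

With lam = 0 this reduces to ordinary least squares; as lam grows, coefficients are shrunk and eventually set exactly to zero, which is the lasso's variable-selection behavior.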
Regularization and variable selection via the elastic net
TLDR
It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Rotation of principal components: choice of normalization constraints
Following a principal component analysis, it is fairly common practice to rotate some of the components, often using orthogonal rotation. It is a frequent misconception that orthogonal rotation will…
A new approach to variable selection in least squares problems
TLDR
A compact descent method for solving the constrained problem for a particular value of κ is formulated, and a homotopy method, in which the constraint bound κ becomes the homotopy parameter, is developed to completely describe the possible selection regimes.
Interactive exploration of microarray gene expression patterns in a reduced dimensional space.
TLDR
In this study, PCA projection facilitated discriminatory gene selection for different tissues and identified tissue-specific gene expression signatures for liver, skeletal muscle, and brain samples.
Singular value decomposition for genome-wide expression data processing and modeling.
TLDR
Using singular value decomposition in transforming genome-wide expression data from genes x arrays space to reduced diagonalized "eigengenes" x "eigenarrays" space gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype.