Principal Component Analysis

Felipe L. Gewers, Gustavo R. Ferreira, Henrique Ferraz de Arruda, Filipi Nascimento Silva, César Henrique Comin, Diego Raphael Amancio, Luciano da Fontoura Costa
ACM Computing Surveys (CSUR), pages 1-34
Principal component analysis (PCA) is widely used to analyze data across diverse areas. This work reports, in an accessible and integrated manner, several theoretical and practical aspects of PCA. It first addresses the basic principles underlying PCA, data standardization, possible visualizations of the PCA results, and outlier detection. Next, the potential of using PCA for dimensionality reduction is illustrated on several real-world datasets. Finally, we summarize PCA…
How are scientific works viewed?
This research was motivated by the observation that view profiles over time tend to be piecewise linear. It found that models incorporating joint dependencies between the properties of the segments provided the most accurate results among the considered alternatives.
Coincidence complex networks
It is shown that the two proposed real-valued approaches can lead to enhanced performance when compared to cosine and Pearson correlation approaches, yielding a detailed description of the specific patterns of connectivity between the nodes, with enhanced modularity.
Unsupervised mapping of a hybrid urban area in South Africa
This work proposes a classification strategy that gives the analyst control of 60% of the parameters to ensure an acceptable segmentation outcome, together with a feature selection approach that eliminates feature overlaps within the feature space which may not be observable within the original data.
Automated Machine Learning using Evolutionary Algorithms
  • M. Anton
  • Computer Science
    2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)
  • 2020
A novel approach to Automated Machine Learning using Evolutionary Algorithms is presented, and its performance is demonstrated by top results in benchmark tests.
Development and application of a supervised pattern recognition algorithm for identification of fuel-specific emissions profiles
Abstract. Wildfires have increased in frequency and intensity in the western United States (US) over the past decades, with negative consequences for air quality. Wildfires emit large quantities of
Harmonic Complex Networks
The musical interpretations of the results include the confirmation of the more regular consonance pattern of the equal temperament, obtained at the expense of a wider range of consonances such as that obtained in the meantone temperament.


Sparse Principal Component Analysis
This work introduces a new method called sparse principal component analysis (SPCA) using the lasso (elastic net) to produce modified principal components with sparse loadings and shows that PCA can be formulated as a regression-type optimization problem.
Probabilistic Principal Component Analysis
Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based on a probability model. We demonstrate how the principal axes of a set of
Principal component analysis - a tutorial
  • A. Tharwat
  • Computer Science
    Int. J. Appl. Pattern Recognit.
  • 2016
Basic definitions of the PCA technique are presented, along with the algorithms of two methods of calculating PCA, namely, the covariance matrix and Singular Value Decomposition (SVD) methods.
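The two calculation routes named in this summary can be compared with a minimal NumPy sketch. It is an illustration under stated assumptions, not the tutorial's own code: the function names `pca_covariance` and `pca_svd` and the synthetic data are hypothetical. Both routes should yield the same component variances and the same principal axes up to sign.

```python
import numpy as np

def pca_covariance(X):
    """PCA via eigendecomposition of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)                      # center each feature
    cov = Xc.T @ Xc / (X.shape[0] - 1)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # sort by decreasing variance
    return eigvals[order], eigvecs[:, order]

def pca_svd(X):
    """PCA via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)          # singular values -> variances
    return variances, Vt.T                       # columns are principal axes

# Synthetic, correlated data (an assumption made only for this demo).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))

var_cov, axes_cov = pca_covariance(X)
var_svd, axes_svd = pca_svd(X)
```

The SVD route avoids forming the covariance matrix explicitly, which is usually preferable numerically when the number of features is large.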
On Consistency and Sparsity for Principal Components Analysis in High Dimensions
  • I. Johnstone, A. Lu
  • Computer Science, Mathematics
    Journal of the American Statistical Association
  • 2009
A simple algorithm for selecting a subset of coordinates with largest sample variances is provided, and it is shown that if PCA is done on the selected subset, then consistency is recovered, even if p(n) ≫ n.
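The idea in this summary, keeping only the coordinates with the largest sample variances and then running PCA on that subset, can be sketched as follows. This is a simplified illustration rather than the paper's exact procedure: the function name `variance_screened_pca`, the fixed subset size `k`, and the synthetic data are all hypothetical choices for the example.

```python
import numpy as np

def variance_screened_pca(X, k, n_components=1):
    """Select the k highest-variance coordinates, then run PCA on them.

    Returns the selected coordinate indices and a (p, n_components)
    loading matrix that is zero outside the selected coordinates.
    """
    variances = X.var(axis=0, ddof=1)                 # per-coordinate sample variance
    selected = np.sort(np.argsort(variances)[-k:])    # indices of the k largest
    Xs = X[:, selected] - X[:, selected].mean(axis=0) # center the retained columns
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    loadings = np.zeros((X.shape[1], n_components))
    loadings[selected, :] = Vt[:n_components].T       # embed back into p dimensions
    return selected, loadings

# Synthetic example with many more coordinates than samples (p >> n):
# one strong component supported on the first 5 coordinates, plus unit noise.
rng = np.random.default_rng(0)
n, p = 50, 500
v = np.zeros(p)
v[:5] = 1 / np.sqrt(5)
X = np.outer(rng.normal(scale=10.0, size=n), v) + rng.normal(size=(n, p))

selected, loadings = variance_screened_pca(X, k=5)
```

In this sketch the subset size is passed in directly; choosing it from the data is a separate question that the example does not address.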
Bayesian PCA
This paper uses probabilistic reformulation as the basis for a Bayesian treatment of PCA to show that effective dimensionality of the latent space (equivalent to the number of retained principal components) can be determined automatically as part of the Bayesian inference procedure.
Control Procedures for Residuals Associated With Principal Component Analysis
This paper is concerned with the treatment of residuals associated with principal component analysis. These residuals are the difference between the original observations and the predictions of them
Dimension Reduction
  • Sushant Sachdeva, Xiao Shi
  • Computer Science
    Encyclopedia of GIS
  • 2008
This lecture covers the Johnson-Lindenstrauss Lemma, how to preserve distance information in data, and common techniques including Singular Value Decomposition (SVD).
Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?
This paper reviews the main approaches to online PCA, namely, perturbation techniques, incremental methods and stochastic optimisation, and compares the most widely employed techniques in terms of statistical accuracy, computation time and memory requirements using artificial and real data.
Robust statistics for outlier detection
An overview is presented of several robust methods and outlier detection tools for univariate, low-dimensional, and high-dimensional data, such as estimation of location and scatter, linear regression, principal component analysis, and classification.
Kernel PCA and De-Noising in Feature Spaces
This work presents ideas for finding approximate pre-images, focusing on Gaussian kernels, and shows experimental results using these pre-images in data reconstruction and de-noising on toy examples as well as on real-world data.