Parallel GPU Implementation of Iterative PCA Algorithms
@article{Andrecut2009ParallelGI,
  title={Parallel GPU Implementation of Iterative PCA Algorithms},
  author={Mircea Andrecut},
  journal={Journal of computational biology : a journal of computational molecular cell biology},
  year={2009},
  volume={16},
  number={11},
  pages={1593--1599}
}
Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. [...] Here we present an algorithm based on Gram-Schmidt orthogonalization (called GS-PCA), which eliminates this shortcoming of NIPALS-PCA. Also, we discuss the GPU (Graphics Processing Unit) parallel implementation of both NIPALS-PCA and GS-PCA algorithms. The numerical results show that the GPU parallel optimized versions, based on CUBLAS (NVIDIA), are substantially faster (up to 12 times) than the…
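As a rough illustration of the idea in the abstract, the sketch below extracts principal components one at a time with a NIPALS-style power iteration and re-orthogonalizes each loading vector against the previously found ones with a Gram-Schmidt step. It is not the paper's GS-PCA or its CUBLAS code: the function names (`gs_iterative_pca`, `dot`, `_project_out`, `_normalize`), the starting-vector rule, the tolerances, and the toy data are all assumptions made for this example, and it runs in plain Python on the CPU.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def _project_out(v, basis):
    """Gram-Schmidt step: remove from v its components along each unit vector in basis."""
    for q in basis:
        c = dot(v, q)
        v = [a - c * b for a, b in zip(v, q)]
    return v


def _normalize(v):
    """Scale v to unit length."""
    nrm = math.sqrt(dot(v, v))
    return [a / nrm for a in v]


def gs_iterative_pca(X, n_components, max_iter=1000, tol=1e-12):
    """Hypothetical sketch: iterative PCA with Gram-Schmidt re-orthogonalization.

    X is a list of rows (samples x features). Returns (loadings, variances),
    where loadings are unit vectors and variances use the n - 1 denominator.
    """
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    R = [[row[j] - means[j] for j in range(m)] for row in X]  # centered data

    loadings, variances = [], []
    for _ in range(n_components):
        # Assumed start rule: first centered row with a non-negligible part
        # outside the span of the loadings found so far. Power iteration needs
        # a start with some component along the next leading direction.
        p = None
        for row in R:
            cand = _project_out(list(row), loadings)
            if math.sqrt(dot(cand, cand)) > 1e-12:
                p = _normalize(cand)
                break
        if p is None:
            break  # no variance left to explain

        for _ in range(max_iter):
            t = [dot(row, p) for row in R]  # scores: t = R p
            p_new = [sum(t[i] * R[i][j] for i in range(n)) for j in range(m)]  # R^T t
            p_new = _normalize(_project_out(p_new, loadings))  # keep orthogonal
            converged = 1.0 - abs(dot(p_new, p)) < tol
            p = p_new
            if converged:
                break

        t = [dot(row, p) for row in R]
        loadings.append(p)
        variances.append(dot(t, t) / (n - 1))
    return loadings, variances


if __name__ == "__main__":
    # Toy data whose principal axes lie along (1, 1) and (1, -1).
    data = [[4.0, 2.0], [2.0, 4.0], [-4.0, -2.0], [-2.0, -4.0]]
    for vec, var in zip(*gs_iterative_pca(data, 2)):
        print(vec, var)
```

The two inner products with `R` (computing `R p` and `R^T t`) dominate the cost; these are matrix-vector products of the kind a BLAS library provides, which is presumably where a GPU version would gain its speedup.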
113 Citations
Modified fast PCA algorithm on GPU architecture
- Computer Science
- Proceedings of IEEE East-West Design & Test Symposium (EWDTS 2014)
- 2014
The modified fast PCA (MFPCA) algorithm is presented for the GPU architecture, and its suitability for face recognition is discussed. Experimental results show a decrease in MFPCA execution time while preserving the quality of the results.
Accelerating a Geometrical Approximated PCA Algorithm Using AVX2 and CUDA
- Computer Science
- Remote. Sens.
- 2020
The experimental evaluation has shown not only the advantage of using CUDA programming in implementing the gaPCA algorithm on a GPU in terms of performance and energy consumption, but also significant benefits in implementing it on the multi-core CPU using AVX2 intrinsics.
Real-time PCA calculation for spectral imaging (using SIMD and GP-GPU)
- Computer Science
- Journal of Real-Time Image Processing
- 2010
Two optimized implementations of the PCA algorithm are presented, primarily targeted at real-time spectral image analysis: one uses the SSE instruction set of contemporary CPUs, and the other runs on graphics processors using the CUDA environment.
Tuning Principal Component Analysis for GRASS GIS on Multi-core and GPU Architectures
- Computer Science
- 2010
This paper uses imaging spectrometer data to demonstrate the performance improvements attained by the implementation of PCA in GRASS GIS, which reduced runtime by nearly 99% using only multi-core related optimizations and an additional 50% reduction using GPU related optimizations.
A GPU parallel implementation of the Local Principal Component Analysis overcomplete method for DW image denoising
- Computer Science
- 2016 IEEE Symposium on Computers and Communication (ISCC)
- 2016
This work designs and implements a parallel version of the OLPCA, by using a suitable mapping of the tasks on a GPU architecture with the aim of investigating the performance and the denoising features of the algorithm.
Performance Evaluation of Gradient-based Dimensionality Reduction Methods on Different Devices
- Computer Science
- 2020 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT)
- 2020
This paper evaluates the efficiency of the nonlinear mapping dimensionality reduction technique, based on the stochastic and classic gradient descent algorithms, implemented using CUDA for NVIDIA GPU, HIP for AMD GPU, and OpenMP with AVX2 for CPU.
Implementation of a covariance-based principal component analysis algorithm for hyperspectral imaging applications with multi-threading in both CPU and GPU
- Computer Science
- 2012 IEEE International Geoscience and Remote Sensing Symposium
- 2012
An improvement that combines multithreading on the CPU, the GPU, and CUDA's graphics interoperability is presented, and this combined framework is found to come much closer to real-time processing.
A collaborative CPU-GPU approach for principal component analysis on mobile heterogeneous platforms
- Computer Science
- J. Parallel Distributed Comput.
- 2018
Implementation of a covariance-based principal component analysis algorithm with a CUDA-enabled graphics processing unit
- Computer Science
- 2011 IEEE International Geoscience and Remote Sensing Symposium
- 2011
The covariance-matrix approach is found to have great potential for reaching real-time performance, and the performance of the GPU implementations is compared with that of their CPU counterparts.
Non-negative Matrix Factorization on GPU
- Computer Science
- NDT
- 2010
Computation of NMF on a GPU using CUDA technology is introduced. Its main advantage lies in processing non-negative matrix factorizations that are easily interpretable as images, but applications can be found in other areas as well.
References
Efficient Gram–Schmidt orthonormalisation on parallel computers
- Computer Science
- 2000
The paper shows how these algorithms can be implemented on a parallel computer, and how their communication overhead can be minimized, and provides some guidelines for selecting the most appropriate algorithm.
Loss and Recapture of Orthogonality in the Modified Gram-Schmidt Algorithm
- Computer Science
- SIAM J. Matrix Anal. Appl.
- 1992
The special structure of the product of the Householder transformations is derived, and then used to explain and bound the loss of orthogonality in MGS, which is illustrated by deriving a numerically stable algorithm based on MGS for a class of problems which includes solution of nonsingular linear systems.