Parallel GPU Implementation of Iterative PCA Algorithms

@article{Andrecut2009ParallelGI,
  title={Parallel GPU Implementation of Iterative PCA Algorithms},
  author={Mircea Andrecut},
  journal={Journal of computational biology : a journal of computational molecular cell biology},
  year={2009},
  volume={16},
  number={11},
  pages={1593-9}
}
  • M. Andrecut
  • Published 7 November 2008
  • Computer Science
  • Journal of computational biology : a journal of computational molecular cell biology
Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. [...] Here we present an algorithm based on Gram-Schmidt orthogonalization (called GS-PCA), which eliminates this shortcoming of NIPALS-PCA. We also discuss the GPU (Graphics Processing Unit) parallel implementation of both the NIPALS-PCA and GS-PCA algorithms. The numerical results show that the GPU parallel optimized versions, based on CUBLAS (NVIDIA), are substantially faster (up to 12 times) than the…
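
As a rough illustration of the kind of iteration the abstract refers to, here is a minimal CPU sketch of NIPALS-style component extraction with a Gram-Schmidt re-orthogonalization step applied to each loading vector. It is not the paper's GS-PCA and uses no CUBLAS calls. The function names (`nipals_gs`, `orthogonalize`), the starting-vector choice, and the stopping rule are assumptions made for this example, and it also assumes that the shortcoming mentioned in the truncated abstract is the gradual loss of orthogonality between components.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [a / n for a in v]


def orthogonalize(v, basis):
    # Gram-Schmidt step: remove from v its projection on each earlier vector.
    for b in basis:
        c = dot(v, b)
        v = [a - c * bi for a, bi in zip(v, b)]
    return v


def nipals_gs(X, k, iters=500, tol=1e-12):
    """Hypothetical helper: extract k components from X (a list of rows)."""
    n, m = len(X), len(X[0])
    # Center each column.
    means = [sum(row[j] for row in X) / n for j in range(m)]
    R = [[row[j] - means[j] for j in range(m)] for row in X]
    scores, loadings = [], []
    for _ in range(k):
        # Assumed starting guess: the residual column with the largest norm.
        j0 = max(range(m), key=lambda j: sum(row[j] ** 2 for row in R))
        t = [row[j0] for row in R]
        for _ in range(iters):
            # Loading update p ~ R^T t, re-orthogonalized against earlier loadings.
            p = [sum(R[i][j] * t[i] for i in range(n)) for j in range(m)]
            p = normalize(orthogonalize(p, loadings))
            # Score update t = R p.
            t_new = [dot(row, p) for row in R]
            diff = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if diff <= tol * dot(t, t):
                break
        # Deflate: subtract the rank-one part t p^T from the residual.
        R = [[R[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        scores.append(t)
        loadings.append(p)
    return scores, loadings


if __name__ == "__main__":
    data = [[2, 0], [-2, 0], [0, 1], [0, -1]]
    t, p = nipals_gs(data, 2)
    print(p)
```

The matrix-vector products inside the inner loop are the operations a GPU BLAS library would take over in a parallel version; here they are written as plain Python loops to keep the sketch dependency-free.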

Modified fast PCA algorithm on GPU architecture

The modified fast PCA (MFPCA) algorithm is presented on the GPU architecture, and its suitability for the face recognition task is discussed. Experimental results show a decrease in MFPCA execution time while preserving the quality of the results.

Accelerating a Geometrical Approximated PCA Algorithm Using AVX2 and CUDA

The experimental evaluation has shown not only the advantage of using CUDA programming in implementing the gaPCA algorithm on a GPU in terms of performance and energy consumption, but also significant benefits in implementing it on the multi-core CPU using AVX2 intrinsics.

Application of the OpenCL API for Implementation of the NIPALS Algorithm for Principal Component Analysis of Large Data Sets

  • J. C. Bowden
  • Computer Science
    2010 Sixth IEEE International Conference on e-Science Workshops
  • 2010
An implementation of the nonlinear iterative partial least squares (NIPALS) algorithm served as a test case for using OpenCL for computation on a general-purpose graphics processing unit (GPGPU).

Real-time PCA calculation for spectral imaging (using SIMD and GP-GPU)

Two optimized implementations of the PCA algorithm are presented, primarily targeting real-time spectral image analysis: one uses the SSE instruction set of contemporary CPUs, and the other runs on graphics processors using the CUDA environment.

Tuning Principal Component Analysis for GRASS GIS on Multi-core and GPU Architectures

This paper uses imaging spectrometer data to demonstrate the performance improvements attained by the implementation of PCA in GRASS GIS, which reduced runtime by nearly 99% using only multi-core related optimizations and an additional 50% reduction using GPU related optimizations.

A GPU parallel implementation of the Local Principal Component Analysis overcomplete method for DW image denoising

This work designs and implements a parallel version of the OLPCA, by using a suitable mapping of the tasks on a GPU architecture with the aim of investigating the performance and the denoising features of the algorithm.

Performance Evaluation of Gradient-based Dimensionality Reduction Methods on Different Devices

  • A. Borisov, E. Myasnikov
  • Computer Science
    2020 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT)
  • 2020
This paper evaluates the efficiency of the nonlinear mapping dimensionality reduction technique, based on the stochastic and classic gradient descent algorithms, implemented using CUDA for NVIDIA GPUs, HIP for AMD GPUs, and OpenMP with AVX2 for CPUs.

Implmentation of a covariance-based principal component analysis algorithm for hyperspectral imaging applications with multi-threading in both CPU and GPU

  • Jian Zhang, Kim Hwa Lim
  • Computer Science
    2012 IEEE International Geoscience and Remote Sensing Symposium
  • 2012
An improvement that combines multithreading on the CPU and GPU with CUDA's graphics interoperability is presented, and this combined framework is found to bring processing much closer to real time.

Implementation of a covariance-based principal component analysis algorithm with a CUDA-enabled graphics processing unit

  • Jian Zhang, Kim Hwa Lim
  • Computer Science
    2011 IEEE International Geoscience and Remote Sensing Symposium
  • 2011
The covariance-matrix approach is found to have great potential for reaching real-time performance, and the performance of the GPU implementations is compared with that of their CPU counterparts.

References

Efficient Gram–Schmidt orthonormalisation on parallel computers

The paper shows how these algorithms can be implemented on a parallel computer, and how their communication overhead can be minimized, and provides some guidelines for selecting the most appropriate algorithm.

A User's Guide to Principal Components.

This book discusses PCA with more than two Variables, Matrix Algebra Associated with Principal Component Analysis, and other Applications of PCA.

Loss and Recapture of Orthogonality in the Modified Gram-Schmidt Algorithm

The special structure of the product of the Householder transformations is derived, and then used to explain and bound the loss of orthogonality in MGS, which is illustrated by deriving a numerically stable algorithm based on MGS for a class of problems which includes solution of nonsingular linear systems.
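
The loss of orthogonality discussed in the Gram-Schmidt references above can be seen in a small numerical experiment. The sketch below compares classical and modified Gram-Schmidt on a nearly rank-deficient set of vectors. The test matrix (a Läuchli-type example with `eps = 1e-8`) and all function names are assumptions chosen for illustration and are not taken from the cited works.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]


def classical_gs(cols):
    # Classical variant: every coefficient is computed from the ORIGINAL vector.
    basis = []
    for a in cols:
        coeffs = [dot(a, q) for q in basis]
        v = list(a)
        for c, q in zip(coeffs, basis):
            v = [x - c * y for x, y in zip(v, q)]
        basis.append(normalize(v))
    return basis


def modified_gs(cols):
    # Modified variant: each coefficient uses the partially reduced vector.
    basis = []
    for a in cols:
        v = list(a)
        for q in basis:
            c = dot(v, q)
            v = [x - c * y for x, y in zip(v, q)]
        basis.append(normalize(v))
    return basis


def orthogonality_error(basis):
    # Largest absolute inner product between two distinct basis vectors.
    return max(
        abs(dot(basis[i], basis[j]))
        for i in range(len(basis))
        for j in range(i)
    )


# Assumed test input: three nearly parallel vectors.
eps = 1e-8
cols = [
    [1.0, eps, 0.0, 0.0],
    [1.0, 0.0, eps, 0.0],
    [1.0, 0.0, 0.0, eps],
]

if __name__ == "__main__":
    print("classical:", orthogonality_error(classical_gs(cols)))
    print("modified: ", orthogonality_error(modified_gs(cols)))
```

On this input the classical variant returns two vectors that are far from orthogonal, while the modified variant keeps the error near the size of `eps`. The modified result is still well above machine precision, which is the kind of residual loss that the bounds in the last reference address.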