# An Algorithm for the Principal Component Analysis of Large Data Sets

@article{Halko2010AnAF, title={An Algorithm for the Principal Component Analysis of Large Data Sets}, author={Nathan Halko and Per-Gunnar Martinsson and Yoel Shkolnisky and Mark Tygert}, journal={SIAM J. Scientific Computing}, year={2010}, volume={33}, pages={2580-2594} }

Recently popularized randomized methods for principal component analysis (PCA) efficiently and reliably produce nearly optimal accuracy, even on parallel processors, unlike the classical (deterministic) alternatives. We adapt one of these randomized methods for use with data sets that are too large to be stored in random-access memory (RAM). (The traditional terminology is that our procedure works efficiently out-of-core.) We illustrate the performance of the algorithm via several numerical…
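To make the out-of-core idea in the abstract concrete, here is a minimal NumPy sketch of a generic randomized low-rank factorization that reads the data matrix only in row blocks. It is not the paper's exact procedure. The function names, the `oversample`, `n_power_iter`, and `block_rows` parameters, and the choice of a Gaussian test matrix are all assumptions made for illustration. The sketch also assumes that the columns of the data have already been mean-centered, or that centering is not needed.

```python
import numpy as np


def iter_row_blocks(a, block_rows):
    """Yield (start, block) pairs covering the rows of `a` in order."""
    for start in range(0, a.shape[0], block_rows):
        # Slicing an np.memmap here touches only the rows of this block.
        yield start, np.asarray(a[start:start + block_rows])


def blockwise_randomized_pca(a, k, oversample=10, n_power_iter=1,
                             block_rows=1024, seed=0):
    """Approximate the top-k SVD of `a` using passes over row blocks.

    Illustrative sketch only (hypothetical helper, not the paper's code).
    Returns (u, s, vt) with u: (m, k), s: (k,), vt: (k, n).
    """
    m, n = a.shape
    l = min(k + oversample, m, n)  # width of the random sketch
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((n, l))

    # Pass 1: Y = A @ Omega, filled in one row block at a time.
    y = np.empty((m, l))
    for start, block in iter_row_blocks(a, block_rows):
        y[start:start + block.shape[0]] = block @ omega
    q, _ = np.linalg.qr(y)

    # Optional power iterations; each one costs two more passes over A.
    for _ in range(n_power_iter):
        z = np.zeros((n, l))
        for start, block in iter_row_blocks(a, block_rows):
            z += block.T @ q[start:start + block.shape[0]]  # A^T Q
        z, _ = np.linalg.qr(z)  # re-orthonormalize for numerical stability
        for start, block in iter_row_blocks(a, block_rows):
            y[start:start + block.shape[0]] = block @ z
        q, _ = np.linalg.qr(y)

    # Final pass: B = Q^T A is small (l x n) and fits in memory.
    b = np.zeros((l, n))
    for start, block in iter_row_blocks(a, block_rows):
        b += q[start:start + block.shape[0]].T @ block

    # SVD of the small matrix, then lift the left factor back to m rows.
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    u = q @ u_small
    return u[:, :k], s[:k], vt[:k]
```

Because the function only ever slices `a` by rows, `a` can be an in-memory array or a disk-backed `np.memmap`. Only the tall, thin matrices `y` and `q` and the small matrices `omega`, `z`, and `b` are held in memory, so the approach is practical when `k + oversample` is much smaller than the number of rows and columns.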


