• Corpus ID: 231846710

# Streaming k-PCA: Efficient guarantees for Oja's algorithm, beyond rank-one updates

@inproceedings{Huang2021StreamingKE,
title={Streaming k-PCA: Efficient guarantees for Oja's algorithm, beyond rank-one updates},
author={De Huang and Jonathan Niles-Weed and Rachel A. Ward},
booktitle={Annual Conference on Computational Learning Theory},
year={2021}
}
• Published in Annual Conference on Computational Learning Theory · 6 February 2021 · Computer Science
We analyze Oja’s algorithm for streaming k-PCA, and prove that it achieves performance nearly matching that of an optimal offline algorithm. Given access to a sequence of i.i.d. d× d symmetric matrices, we show that Oja’s algorithm can obtain an accurate approximation to the subspace of the top k eigenvectors of their expectation using a number of samples that scales polylogarithmically with d. Previously, such a result was only known in the case where the updates have rank one. Our analysis is…
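The update analyzed in the paper can be sketched in a few lines. The following is a minimal NumPy illustration of Oja's algorithm with general (not necessarily rank-one) symmetric updates; the constant step size `eta` and the random orthonormal initialization are illustrative assumptions, not the carefully scheduled step sizes used in the paper's analysis.

```python
import numpy as np

def oja_kpca(stream, d, k, eta=0.01):
    """Sketch of Oja's algorithm for streaming k-PCA.

    `stream` yields symmetric d x d sample matrices A_t with a common
    expectation Sigma; we track an orthonormal d x k basis Q whose span
    should approximate the top-k eigenspace of Sigma.
    """
    rng = np.random.default_rng(0)
    # random orthonormal initialization (illustrative choice)
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for A in stream:
        Q = Q + eta * (A @ Q)   # gradient-ascent-style Oja update
        Q, _ = np.linalg.qr(Q)  # re-orthonormalize via QR
    return Q
```

Each iteration multiplies the current basis by I + eta·A_t and re-orthonormalizes; the paper's guarantee is that the span of Q approaches the top-k eigenspace of E[A_t] using a number of samples that scales only polylogarithmically with d.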
It is shown that the Grassmannian Rank-One Subspace Estimation (GROUSE) algorithm is indeed equivalent to Oja's algorithm in the sense that, at each iteration, given a step size for one of the algorithms, one may construct a step size for the other algorithm that results in an identical update.
It is proved that, with high probability, Oja's algorithm achieves an efficient, gap-free, global convergence rate for approximating a principal component subspace of any sub-Gaussian distribution.
• Mathematics, Computer Science · NeurIPS · 2021
A weighted χ² approximation result is established for the sin² error between the population eigenvector and the output of Oja's algorithm, thereby establishing the bootstrap as a consistent inferential method in an appropriate asymptotic regime.
• Computer Science · 2022
A stochastic Gauss-Newton (SGN) algorithm is proposed for the online principal component analysis (OPCA) problem, which is formulated using the symmetric low-rank product model for dominant eigenspace calculation.
• Computer Science · 2019
This work considers streaming principal component analysis when the stochastic data-generating model is subject to perturbations and provides fundamental limits on convergence of any algorithm recovering principal components.
• Computer Science · EC · 2022
This work shows how to design content recommendations that can achieve approximate stationarity, under mild conditions on the set of available content, when a user's preferences are known, and how one can learn enough about a user's preferences to implement such a strategy even when those preferences are initially unknown.
• Computer Science · ArXiv · 2021
The correspondence between Gaussian process regression and Geometric Harmonics is discussed, providing alternative interpretations of uncertainty in terms of error estimation, or leading towards accelerated Bayesian Optimization due to dimensionality reduction.

## References

Showing 1–10 of 35 references

• Computer Science · 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)
The results match the information theoretic lower bound in terms of dependency on error, on eigengap, on rank k, and on dimension d, up to poly-log factors.
• Computer Science · AISTATS · 2016
This paper analyzes the convergence rate of a representative algorithm with decayed learning rate (Oja and Karhunen, 1985) in the first family for the general $k>1$ case and proposes a novel algorithm for the second family that sets the block sizes automatically and dynamically with faster convergence rate.
• Computer Science · ArXiv · 2019
AdaOja is a novel variation of the Adagrad algorithm applied to Oja's algorithm, introduced in the single-eigenvector case and extended to the multiple-eigenvector case; it is demonstrated on dense synthetic data, sparse real-world data, and dense real-world data that AdaOja outperforms common learning rate choices for Oja's method.
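As a rough illustration of the adaptive-step-size idea, here is a hedged sketch of an Adagrad-style learning rate applied to Oja's update in the single-eigenvector case; the initial accumulator value `eta0` and the sphere-projection step are illustrative assumptions, not the precise AdaOja scheme.

```python
import numpy as np

def adagrad_oja(xs, d, eta0=1e-5):
    """Sketch: Oja's single-eigenvector update with an Adagrad-style
    step size set automatically from accumulated gradient norms."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    b = eta0  # running sum of squared stochastic-gradient norms
    for x in xs:
        g = (x @ w) * x            # stochastic gradient (x x^T) w
        b += g @ g                 # Adagrad accumulator
        w = w + g / np.sqrt(b)     # step size shrinks as data accrues
        w /= np.linalg.norm(w)     # renormalize onto the unit sphere
    return w
```

The appeal over a hand-tuned schedule is that the effective step size decays at a data-driven rate, with no learning-rate constant to pick per dataset.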
• O. Shamir · Computer Science, Mathematics · ICML · 2016
The convergence properties of the VR-PCA algorithm introduced by Shamir (2015) are studied, including a formal analysis of a block version of the algorithm, and convergence from random initialization is proved.
This paper provides the first eigengap-free convergence guarantees for SGD in the context of PCA in a streaming stochastic setting, and shows that the same techniques lead to new SGD convergence guarantees with better dependence on the eigengap.
• Computer Science, Mathematics · STOC · 2018
A query complexity lower bound for approximating the top-r-dimensional eigenspace of a matrix is established, along with a strict separation between convex optimization and “strict-saddle” non-convex optimization, of which PCA is a canonical example.
• Computer Science · NIPS · 2013
The top eigenvector of A is computed in an incremental fashion, with an algorithm that maintains an estimate of the top eigenvector in O(d) space and incrementally adjusts the estimate with each new data point that arrives.
• Computer Science · NIPS · 2014
A new robust convergence analysis is provided for the noisy power method, a variant of the well-known power method for computing the dominant singular vectors of a matrix, showing that the algorithm's error dependence on the matrix dimension can be replaced by an essentially tight dependence on the coherence of the matrix.
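The noisy power method summarized above can be sketched as follows; `noise_scale` is a hypothetical knob standing in for the per-iteration perturbation G_t of the original analysis (with `noise_scale=0` this reduces to the exact block power method).

```python
import numpy as np

def noisy_power_method(A, k, iters=50, noise_scale=0.0, rng=None):
    """Sketch of the noisy power method for the top-k eigenspace of a
    symmetric matrix A: multiply, perturb, re-orthonormalize."""
    rng = rng or np.random.default_rng(0)
    d = A.shape[0]
    X, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random orthonormal start
    for _ in range(iters):
        Y = A @ X + noise_scale * rng.standard_normal((d, k))  # noisy multiply
        X, _ = np.linalg.qr(Y)                                 # orthonormalize
    return X
```

The analysis cited above shows how large the per-step perturbation may be while the iterates still converge to the dominant subspace, with the dimension dependence of the error replaced by a coherence dependence.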
• Computer Science · ICML · 2015
This paper exhibits a step size scheme for SGD on a low-rank least-squares problem, and proves that, under broad sampling conditions, the method converges globally from a random starting point within $O(\epsilon^{-1} n \log n)$ steps with constant probability for constant-rank problems.
• Computer Science · NIPS · 2013
An algorithm is presented that uses O(kp) memory and computes the k-dimensional spike with O(p log p) sample complexity, the first algorithm of its kind.