Estimating leverage scores via rank revealing methods and randomization
@article{Sobczyk2021EstimatingLS,
  title={Estimating leverage scores via rank revealing methods and randomization},
  author={Aleksandros Sobczyk and Efstratios Gallopoulos},
  journal={ArXiv},
  year={2021},
  volume={abs/2105.11004}
}
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank. Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms. We first develop a set of fast novel algorithms for rank estimation, column subset selection and least squares preconditioning. We then describe the design and implementation of leverage score estimators based on these primitives…
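For orientation, the leverage scores of a matrix are the squared row norms of any orthonormal basis of its column space, so they sum to the rank. The following is a minimal sketch of the exact, non-randomized baseline that estimators of this kind are usually compared against; it is not the paper's algorithm. The function name `exact_leverage_scores` and the relative tolerance `rtol` used to decide the numerical rank are assumptions made for illustration.

```python
import numpy as np

def exact_leverage_scores(A, rtol=1e-12):
    """Exact leverage scores of the rows of A via a thin SVD.

    Baseline only (hypothetical helper, not from the paper): costs a full SVD,
    which is what randomized estimators try to avoid.
    """
    A = np.asarray(A, dtype=float)
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    if s.size == 0 or s[0] == 0.0:
        # Zero matrix: the column space is trivial, so every score is zero.
        return np.zeros(A.shape[0])
    # Numerical rank: keep singular values above rtol times the largest one.
    rank = int(np.sum(s > rtol * s[0]))
    # Leverage score of row i = squared norm of row i of the rank-r basis U_r.
    return np.sum(U[:, :rank] ** 2, axis=1)
```

Because the basis is truncated to the numerical rank, the sketch also handles rank-deficient inputs, and the returned scores lie in [0, 1] up to rounding.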
3 Citations
A quantum-inspired algorithm for approximating statistical leverage scores
- Computer Science
- ArXiv
- 2021
This work proposes a quantum-inspired algorithm for approximating the statistical leverage scores of a matrix A and shows that this algorithm takes time polynomial in an integer k, condition number κ and logarithm of the matrix size.
pylspack: Parallel algorithms and data structures for sketching, column subset selection, regression and leverage scores
- Computer Science
- ACM Transactions on Mathematical Software
- 2022
This work presents parallel algorithms and data structures for three fundamental operations in Numerical Linear Algebra, with a special focus on “tall-and-skinny” matrices, which arise in many applications.
Approximate Euclidean lengths and distances beyond Johnson-Lindenstrauss
- Computer Science
- ArXiv
- 2022
This work presents an algorithm to estimate the Euclidean lengths of the rows of a matrix and proves element-wise probabilistic bounds that are at least as good as standard JL approximations in the worst case, but asymptotically better for matrices with decaying spectrum.
References
SHOWING 1-10 OF 82 REFERENCES
Revisiting the Nyström Method for Improved Large-scale Machine Learning
- Computer Science
- J. Mach. Learn. Res.
- 2016
An empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices, complemented by a suite of worst-case theoretical bounds for both random sampling and random projection methods.
An Empirical Evaluation of Sketched SVD and its Application to Leverage Score Ordering
- Computer Science
- ACML
- 2018
This work presents Sketched Leverage Score Ordering, a technique for determining the ordering of data in the training of neural networks based on the distributed computation of leverage scores using random projections, which is faster compared to standard randomized projection algorithms and shows improvements in convergence and results.
Input Sparsity Time Low-rank Approximation via Ridge Leverage Score Sampling
- Computer Science
- SODA
- 2017
We present a new algorithm for finding a near optimal low-rank approximation of a matrix $A$ in $O(nnz(A))$ time. Our method is based on a recursive sampling scheme for computing a representative…
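As background for the entry above, rank-k ridge leverage scores are commonly defined as $\tau_i = a_i^T (A^T A + \lambda I)^{-1} a_i$ with $\lambda = \|A - A_k\|_F^2 / k$, where $a_i$ is the i-th row of $A$ and $A_k$ is the best rank-k approximation. The sketch below evaluates that definition directly through an SVD; it only illustrates the quantity being sampled and is not the input-sparsity-time recursive scheme the summary refers to. The function name `ridge_leverage_scores` is hypothetical.

```python
import numpy as np

def ridge_leverage_scores(A, k):
    """Rank-k ridge leverage scores of the rows of A (direct SVD evaluation).

    Illustrative helper, not the paper's fast algorithm. Assumes
    1 <= k < rank(A), so the regularizer lam is strictly positive.
    """
    A = np.asarray(A, dtype=float)
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    # lam = ||A - A_k||_F^2 / k, i.e. the tail energy of the spectrum over k.
    lam = np.sum(s[k:] ** 2) / k
    if lam <= 0.0:
        raise ValueError("k must be smaller than the numerical rank of A")
    # In the SVD basis: tau_i = sum_j U_ij^2 * s_j^2 / (s_j^2 + lam).
    weights = s ** 2 / (s ** 2 + lam)
    return (U ** 2) @ weights
```

With this choice of the regularizer, the scores sum to at most 2k, which is what keeps the number of sampled rows or columns proportional to k rather than to the full rank.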
Provable deterministic leverage score sampling
- Computer Science
- KDD
- 2014
This work provides a novel theoretical analysis of deterministic leverage score sampling and shows that such sampling can be provably as accurate as its randomized counterparts, if the leverage scores follow a moderately steep power-law decay.
Fast approximation of matrix coherence and statistical leverage
- Computer Science
- ICML
- 2012
A randomized algorithm is proposed that takes as input an arbitrary n × d matrix A, with n ≫ d, and returns, as output, relative-error approximations to all n of the statistical leverage scores.
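The general shape of such an estimator can be sketched in a few lines: compress the tall matrix with a random sketch, take the R factor of the compressed matrix so that $A R^{-1}$ is approximately orthonormal, then estimate its row norms through a second, small random projection. The sketch below is a simplified illustration under stated assumptions, not the algorithm from the entry above: it uses dense Gaussian sketches in place of a fast structured transform, assumes `A` has full column rank, and the names `approx_leverage_scores`, `sketch_rows` and `jl_cols` are hypothetical.

```python
import numpy as np

def approx_leverage_scores(A, sketch_rows, jl_cols, seed=0):
    """Approximate leverage scores of a tall, full-column-rank matrix A.

    Simplified illustration: Gaussian sketches stand in for the fast
    transforms used in practice, so this version is not asymptotically fast.
    """
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = A.shape
    # Step 1: compress the n rows of A down to sketch_rows (needs >= d rows).
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    # Step 2: R from a QR of the sketch makes A @ inv(R) nearly orthonormal.
    _, R = np.linalg.qr(S @ A)
    # Step 3: a second projection avoids forming the n x d product exactly.
    G = rng.standard_normal((d, jl_cols)) / np.sqrt(jl_cols)
    X = np.linalg.solve(R, G)  # equals inv(R) @ G without an explicit inverse
    # Step 4: squared row norms of A @ inv(R) @ G estimate the scores.
    return np.sum((A @ X) ** 2, axis=1)
```

Larger values of `sketch_rows` and `jl_cols` tighten the approximation at the cost of more work; the fixed `seed` only makes the sketch reproducible.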
Probabilistic Leverage Scores for Parallelized Unsupervised Feature Selection
- Computer Science
- IWANN
- 2017
The use of Probabilistic PCA is proposed to compute the leverage scores in O(mnk) time, enabling the applicability of some of these randomized methods to large, high-dimensional data sets and offering a parallelized version over the emerging Resilient Distributed Datasets paradigm on Apache Spark.
Augmented Leverage Score Sampling with Bounds
- Computer Science
- ECML/PKDD
- 2016
An empirical evaluation of the proposed augmented leverage score performance on the column subsample selection problem (CSSP) as compared to the traditional leverage score and other methods in both a deterministic and probabilistic sampling paradigm is presented.
An Empirical Evaluation of Sketching for Numerical Linear Algebra
- Computer Science, Mathematics
- KDD
- 2018
This work investigates least squares regression, iteratively reweighted least squares, logistic regression, robust regression with Huber and Bisquare loss functions, leverage score computation, Frobenius norm low rank approximation, and entrywise $\ell_1$-low rank approximation.
Tighter Low-rank Approximation via Sampling the Leveraged Element
- Computer Science
- SODA
- 2015
This work proposes a new randomized algorithm for computing a low-rank approximation to a given matrix that combines the best aspects of otherwise disparate current results, but with a dependence on the condition number $\kappa = \sigma_1/\sigma_r$.
Iterative Row Sampling
- Computer Science
- 2013 IEEE 54th Annual Symposium on Foundations of Computer Science
- 2013
This work shows that alternating between computing a short matrix estimate and finding more accurate approximate leverage scores leads to a series of geometrically smaller instances that gives an algorithm whose runtime is input sparsity plus an overhead comparable to the cost of solving a regression problem on the smaller approximation.