# Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?

@inproceedings{Musco2017IsIS, title={Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?}, author={Cameron Musco and David P. Woodruff}, booktitle={NIPS}, year={2017} }

Low-rank approximation is a common tool used to accelerate kernel methods: the $n \times n$ kernel matrix $K$ is approximated via a rank-$k$ matrix $\tilde K$ which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation. We show that for a broad class of kernels, including the popular Gaussian and polynomial kernels, computing a relative-error rank-$k$ approximation to $K$ is at least as difficult…

## 11 Citations

Finding a latent k-polytope in O(k . nnz(data)) time via Subset Smoothing.

- Mathematics
- 2019

In this paper we show that a large class of Latent variable models, such as Mixed Membership Stochastic Block(MMSB) Models, Topic Models, and Adversarial Clustering, can be unified through a…

Impossibility Results for Grammar-Compressed Linear Algebra

- Computer Science, Mathematics
- NeurIPS
- 2020

This paper considers lossless compression schemes, and asks if one can run computations on the compressed data as efficiently as if the original data was that small, and considers the most basic linear algebra operations: inner product, matrix-vector multiplication, and matrix multiplication.

Near Optimal Frequent Directions for Sketching Dense and Sparse Matrices

- Computer Science, Mathematics
- ICML
- 2018

New space-optimal algorithms with faster running times are provided, and it is shown that these running times are near-optimal unless the state-of-the-art running time of matrix multiplication can be improved significantly.
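
The Frequent Directions technique behind this line of work is compact enough to sketch: stream the rows of $A$ into a small buffer and, whenever it fills, shrink its squared singular values by the $\ell$-th largest one. The NumPy version below is our own minimal rendering (function name, buffer-doubling scheme, and sizes are illustrative choices, not the paper's optimized algorithm):

```python
import numpy as np

def frequent_directions(A, ell):
    """Minimal Frequent Directions sketch: returns an ell x d matrix B
    with B^T B <= A^T A; the shrinking step bounds the spectral error
    in terms of ||A||_F^2 / ell. Assumes d >= ell."""
    n, d = A.shape
    B = np.zeros((2 * ell, d))   # working buffer with room to double
    nxt = 0                      # index of the next free row
    for row in A:
        if nxt == 2 * ell:
            # Buffer full: rotate to the singular-vector basis and shrink
            # every squared singular value by the ell-th largest one.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            shrunk = np.sqrt(np.maximum(s**2 - s[ell - 1] ** 2, 0.0))
            B = np.zeros((2 * ell, d))
            B[: len(shrunk)] = shrunk[:, None] * Vt
            nxt = ell            # rows ell..2*ell-1 are zero again
        B[nxt] = row
        nxt += 1
    # Final shrink, then keep the top ell rows as the sketch.
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    shrunk = np.sqrt(np.maximum(s**2 - s[ell - 1] ** 2, 0.0))
    return shrunk[:ell, None] * Vt[:ell]
```

When the input's rank is below $\ell$ the shrinkage step is a no-op and the sketch preserves $A^\top A$ exactly; in general the shrinkage trades a bounded downward bias for the small memory footprint.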

Statistical and Computational Trade-Offs in Kernel K-Means

- Mathematics, Computer Science
- NeurIPS
- 2018

It is proved under basic assumptions that sampling Nyström landmarks greatly reduces computational cost without incurring any loss of accuracy, the first result of this kind for unsupervised learning.

Generalized Leverage Score Sampling for Neural Networks

- Computer Science, Mathematics
- NeurIPS
- 2020

The equivalence between regularized neural networks and neural tangent kernel ridge regression is proved under both classical random Gaussian initialization and leverage score sampling.

The power of randomized algorithms: from numerical linear algebra to biological systems

- Computer Science
- 2018

A randomized low-rank approximation algorithm for positive semidefinite matrices that runs in sublinear time is given, significantly improving upon what is possible with traditional deterministic methods.

Retraining-free methods for fast on-the-fly pruning of convolutional neural networks

- Mathematics, Computer Science
- Neurocomputing
- 2019

Three model-independent, magnitude-based methods for sparsifying model weights are proposed and evaluated; they are efficient and can be applied on-the-fly at model load time, as required in some deployment contexts.

Fast On-the-fly Retraining-free Sparsification of Convolutional Neural Networks

- Computer Science
- ArXiv
- 2018

It is shown that state-of-the-art models' weights can be reduced by up to 73% (a compression factor of 3.7x) without incurring more than a 5% loss in Top-5 accuracy, indicating that the fast on-the-fly methods are effective.

Fast On-the-fly Retraining-free Sparsification of Convolutional Neural Networks

- 2018

Modern Convolutional Neural Networks (CNNs) are complex, encompassing millions of parameters. Their deployment exerts computational, storage and energy demands, particularly on embedded platforms.…

Input-Sparsity Low Rank Approximation in Schatten Norm

- Computer Science, Mathematics
- ICML
- 2020

The first input-sparsity time algorithms for the rank-$k$ low rank approximation problem in every Schatten norm are given, with a better dependence on $n$ than that in the singular value decomposition for every $p$.

## References

Showing 1–10 of 34 references

Input Sparsity Time Low-rank Approximation via Ridge Leverage Score Sampling

- Computer Science, Mathematics
- SODA
- 2017

We present a new algorithm for finding a near optimal low-rank approximation of a matrix $A$ in $O(nnz(A))$ time. Our method is based on a recursive sampling scheme for computing a representative…

On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2005

An algorithm to compute an easily-interpretable low-rank approximation to an $n \times n$ Gram matrix $G$ such that computations of interest may be performed more rapidly.

Low-Rank PSD Approximation in Input-Sparsity Time

- Computer Science, Mathematics
- SODA
- 2017

This work gives algorithms for approximation by low-rank positive semidefinite (PSD) matrices, and shows that there are asymmetric input matrices that cannot have good symmetric column-selected approximations.

Dimensionality Reduction for k-Means Clustering and Low Rank Approximation

- Mathematics, Computer Science
- STOC
- 2015

This work shows how to approximate a data matrix A with a much smaller sketch ~A that can be used to solve a general class of constrained rank-k approximation problems to within (1+ε) error, and gives a simple alternative to known algorithms that has applications in the streaming setting.

Efficient SVM Training Using Low-Rank Kernel Representations

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2001

This work shows that, for a low-rank kernel matrix, it is possible to design a better interior point method (IPM) in terms of storage requirements as well as computational complexity, and derives an upper bound on the change in the objective function value based on the approximation error and the number of active constraints (support vectors).

Low rank approximation and regression in input sparsity time

- Mathematics, Computer Science
- STOC '13
- 2013

The fastest known algorithms for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and $\ell_p$-regression are obtained.

How to Fake Multiply by a Gaussian Matrix

- Computer Science, Mathematics
- ICML
- 2016

This work proposes a new randomized $m \times n$ matrix $T$, for which one can compute $T \cdot X$ in only $O(\text{nnz}(X)\, m) + \tilde O(m^{1.5} \cdot d^{3})$ time, and applies this transform to nonnegative matrix factorization (NMF) and support vector machines (SVM).

Sublinear Time Low-Rank Approximation of Positive Semidefinite Matrices

- Mathematics, Computer Science
- 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)
- 2017

It is shown how to compute a relative-error low-rank approximation to any positive semidefinite (PSD) matrix in sublinear time, and time lower bounds for low-rank approximation of PSD matrices are proved.

Improved Nyström low-rank approximation and error analysis

- Mathematics, Computer Science
- ICML '08
- 2008

An error analysis that directly relates the Nyström approximation quality with the encoding powers of the landmark points in summarizing the data is presented, and the resultant error bound suggests a simple and efficient sampling scheme, the k-means clustering algorithm, for Nyström low-rank approximation.

Subspace Embeddings for the Polynomial Kernel

- Computer Science, Mathematics
- NIPS
- 2014

This work proposes the first fast oblivious subspace embeddings that are able to embed a space induced by a non-linear kernel without explicitly mapping the data to the high-dimensional space.