Corpus ID: 13466949

Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?

@inproceedings{Musco2017IsIS,
  title={Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?},
  author={Cameron Musco and David P. Woodruff},
  booktitle={NIPS},
  year={2017}
}
Low-rank approximation is a common tool used to accelerate kernel methods: the $n \times n$ kernel matrix $K$ is approximated via a rank-$k$ matrix $\tilde K$ which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation. We show that for a broad class of kernels, including the popular Gaussian and polynomial kernels, computing a relative-error rank-$k$ approximation to $K$ is at least as difficult…
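To make the setup in the abstract concrete, here is a minimal NumPy sketch (not the paper's algorithms or lower-bound constructions) that forms a Gaussian kernel matrix K and compares it with its best rank-k approximation obtained from a truncated eigendecomposition; the data sizes and bandwidth parameter are illustrative assumptions.

import numpy as np

def gaussian_kernel(X, gamma=1.0):
    # K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def best_rank_k(K, k):
    # best rank-k approximation of a symmetric PSD matrix in Frobenius norm
    vals, vecs = np.linalg.eigh(K)            # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]          # indices of the k largest eigenvalues
    return (vecs[:, top] * vals[top]) @ vecs[:, top].T

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))            # n = 500 points in 20 dimensions
K = gaussian_kernel(X, gamma=0.01)
K_k = best_rank_k(K, k=10)
print(np.linalg.norm(K - K_k, "fro") / np.linalg.norm(K, "fro"))

Storing the rank-k factors takes O(nk) space instead of O(n^2), which is the saving the abstract refers to; the paper asks how quickly such an approximation can be computed in the first place.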
Citations

Finding a latent k-polytope in O(k · nnz(data)) time via Subset Smoothing
In this paper we show that a large class of latent variable models, such as Mixed Membership Stochastic Block (MMSB) models, topic models, and adversarial clustering, can be unified through a…
Impossibility Results for Grammar-Compressed Linear Algebra
This paper considers lossless compression schemes and asks whether one can run computations on the compressed data as efficiently as if the original data were that small, focusing on the most basic linear algebra operations: inner product, matrix-vector multiplication, and matrix multiplication.
Near Optimal Frequent Directions for Sketching Dense and Sparse Matrices
New space-optimal algorithms with faster running times are provided, and it is shown that these running times are near-optimal unless the state-of-the-art running time of matrix multiplication can be improved significantly.
Statistical and Computational Trade-Offs in Kernel K-Means
It is proved under basic assumptions that sampling Nyström landmarks makes it possible to greatly reduce computational costs without incurring any loss of accuracy, the first result of this kind for unsupervised learning.
Generalized Leverage Score Sampling for Neural Networks
The equivalence between regularized neural networks and neural tangent kernel ridge regression is proved under both classical random Gaussian initialization and leverage score sampling.
The power of randomized algorithms: from numerical linear algebra to biological systems
A randomized low-rank approximation algorithm for positive semidefinite matrices that runs in sublinear time is given, significantly improving upon what is possible with traditional deterministic methods.
Retraining-free methods for fast on-the-fly pruning of convolutional neural networks
Three model-independent methods for sparsifying model weights are proposed and evaluated; they are magnitude-based, efficient, and can be applied on the fly at model load time, which is necessary in some deployment contexts.
Fast On-the-fly Retraining-free Sparsification of Convolutional Neural Networks
It is shown that state-of-the-art models' weights can be reduced by up to 73% (a compression factor of 3.7x) without incurring more than a 5% loss in Top-5 accuracy, indicating that the fast on-the-fly methods are effective.
Fast On-the-fly Retraining-free Sparsification of Convolutional Neural Networks
Modern Convolutional Neural Networks (CNNs) are complex, encompassing millions of parameters. Their deployment exerts computational, storage and energy demands, particularly on embedded platforms.
Input-Sparsity Low Rank Approximation in Schatten Norm
The first input-sparsity time algorithms for the rank-$k$ low-rank approximation problem in every Schatten norm are given, with a better dependence on $n$ than the singular value decomposition for every $p$.

References

Showing 1-10 of 34 references
Input Sparsity Time Low-rank Approximation via Ridge Leverage Score Sampling
We present a new algorithm for finding a near-optimal low-rank approximation of a matrix $A$ in $O(\text{nnz}(A))$ time. Our method is based on a recursive sampling scheme for computing a representative…
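As an illustration of the sampling primitive named in this reference, here is a minimal NumPy sketch of ridge leverage score row sampling. It computes the scores directly via a matrix inverse, so it is not the recursive $O(\text{nnz}(A))$-time scheme described above; the regularization value, sample count, and matrix sizes are illustrative assumptions.

import numpy as np

def ridge_leverage_scores(A, lam):
    # tau_i = a_i^T (A^T A + lam * I)^{-1} a_i for each row a_i of A
    G_inv = np.linalg.inv(A.T @ A + lam * np.eye(A.shape[1]))
    return np.einsum("ij,jk,ik->i", A, G_inv, A)

def sample_rows(A, lam, num_samples, rng):
    # sample rows with probability proportional to their ridge leverage scores,
    # rescaling each kept row so that S^T S is an unbiased estimate of A^T A
    tau = ridge_leverage_scores(A, lam)
    p = tau / tau.sum()
    idx = rng.choice(A.shape[0], size=num_samples, replace=True, p=p)
    return A[idx] / np.sqrt(num_samples * p[idx])[:, None]

rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 30)) @ rng.standard_normal((30, 30))
S = sample_rows(A, lam=1.0, num_samples=300, rng=rng)
print(np.linalg.norm(A.T @ A - S.T @ S) / np.linalg.norm(A.T @ A))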
On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning
An algorithm is presented to compute an easily interpretable low-rank approximation to an $n \times n$ Gram matrix $G$ such that computations of interest may be performed more rapidly.
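Since the Nyström construction itself is short, a self-contained sketch may help; this version samples landmark columns uniformly at random and builds the kernel densely, which is illustrative rather than the exact scheme analyzed in the reference, and all sizes are assumptions.

import numpy as np

def nystrom(K, landmark_idx):
    # approximate PSD K by C @ pinv(W) @ C.T, where C holds the sampled
    # landmark columns and W is the corresponding square block of K
    C = K[:, landmark_idx]                       # n x m sampled columns
    W = K[np.ix_(landmark_idx, landmark_idx)]    # m x m intersection block
    return C @ np.linalg.pinv(W) @ C.T           # rank <= m approximation

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))
K = np.exp(-0.01 * np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2))  # Gaussian kernel
landmarks = rng.choice(len(K), size=50, replace=False)
K_nys = nystrom(K, landmarks)
print(np.linalg.norm(K - K_nys, "fro") / np.linalg.norm(K, "fro"))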
Low-Rank PSD Approximation in Input-Sparsity Time
This work gives algorithms for approximation by low-rank positive semidefinite (PSD) matrices, and shows that there are asymmetric input matrices that cannot have good symmetric column-selected approximations.
Dimensionality Reduction for k-Means Clustering and Low Rank Approximation
This work shows how to approximate a data matrix $A$ with a much smaller sketch $\tilde A$ that can be used to solve a general class of constrained rank-$k$ approximation problems to within (1+ε) error, and gives a simple alternative to known algorithms with applications in the streaming setting.
Efficient SVM Training Using Low-Rank Kernel Representations
This work shows that for a low-rank kernel matrix it is possible to design a better interior point method (IPM) in terms of both storage requirements and computational complexity, and derives an upper bound on the change in the objective function value based on the approximation error and the number of active constraints (support vectors).
Low rank approximation and regression in input sparsity time
The fastest known algorithms for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and $\ell_p$-regression are obtained.
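The input-sparsity time results in this reference are built on sparse subspace embeddings; the following minimal sketch applies a CountSketch embedding to an overconstrained least-squares problem and compares the sketched solution with the exact one. The sketch size and problem dimensions are illustrative assumptions, not values from the paper.

import numpy as np

def countsketch(A, m, rng):
    # apply an m x n CountSketch S to the rows of A without forming S explicitly:
    # each row is hashed to one of m buckets and multiplied by a random sign
    n = A.shape[0]
    h = rng.integers(0, m, size=n)
    s = rng.choice([-1.0, 1.0], size=n)
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, h, s[:, None] * A)
    return SA

rng = np.random.default_rng(0)
n, d, m = 20000, 50, 2000
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
S_Ab = countsketch(np.column_stack([A, b]), m, rng)   # sketch [A | b] in one pass
x_sketch = np.linalg.lstsq(S_Ab[:, :d], S_Ab[:, d], rcond=None)[0]
print(np.linalg.norm(x_exact - x_sketch))

The embedding touches each nonzero of A exactly once, which is where the nnz(A) running time comes from.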
How to Fake Multiply by a Gaussian Matrix
This work proposes a new randomized $m \times n$ matrix $T$, for which one can compute $T \cdot X$ in only $O(\text{nnz}(X)\, m) + \tilde O(m^{1.5} \cdot d^{3})$ time, and applies this transform to nonnegative matrix factorization (NMF) and support vector machines (SVM).
Sublinear Time Low-Rank Approximation of Positive Semidefinite Matrices
It is shown how to compute a relative-error low-rank approximation to any positive semidefinite (PSD) matrix in sublinear time, and time lower bounds for low-rank approximation of PSD matrices are proved.
Improved Nyström low-rank approximation and error analysis
An error analysis that directly relates the Nyström approximation quality to the encoding power of the landmark points in summarizing the data is presented, and the resulting error bound suggests a simple and efficient sampling scheme, the k-means clustering algorithm, for Nyström low-rank approximation.
Subspace Embeddings for the Polynomial Kernel
This work proposes the first fast oblivious subspace embeddings that are able to embed a space induced by a non-linear kernel without explicitly mapping the data to the high-dimensional space.
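A core tool behind such polynomial-kernel embeddings is the TensorSketch transform; the following is a minimal sketch for the degree-q kernel (x^T y)^q, where the inner product of two sketches is an unbiased (and here fairly noisy) estimate of the kernel value. The sketch dimension, degree, and data sizes are illustrative assumptions.

import numpy as np

def tensorsketch(x, hashes, signs, m):
    # degree-q TensorSketch: CountSketch each factor, multiply in the FFT
    # domain (circular convolution), then transform back
    prod = np.ones(m, dtype=complex)
    for h, s in zip(hashes, signs):
        c = np.zeros(m)
        np.add.at(c, h, s * x)        # CountSketch of x under (h, s)
        prod *= np.fft.fft(c)
    return np.real(np.fft.ifft(prod))

rng = np.random.default_rng(0)
d, m, q = 100, 256, 2                  # input dimension, sketch dimension, degree
hashes = [rng.integers(0, m, size=d) for _ in range(q)]
signs = [rng.choice([-1.0, 1.0], size=d) for _ in range(q)]
x, y = rng.standard_normal(d), rng.standard_normal(d)
exact = np.dot(x, y) ** q
approx = np.dot(tensorsketch(x, hashes, signs, m), tensorsketch(y, hashes, signs, m))
print(exact, approx)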