Nyström Sketches
@inproceedings{Perry2017NystrmS,
  title     = {Nystr{\"o}m Sketches},
  author    = {D. Perry and Braxton Osting and Ross T. Whitaker},
  booktitle = {ECML/PKDD},
  year      = {2017}
}
Despite prolific success, kernel methods become difficult to use in many large-scale unsupervised problems because of the evaluation and storage of the full Gram matrix. Here we overcome this difficulty by proposing a novel approach: compute the optimal small, out-of-sample Nyström sketch, which allows for fast approximation of the Gram matrix via the Nyström method. We demonstrate and compare several methods for computing the optimal Nyström sketch and show how this approach outperforms…
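The abstract describes approximating the Gram matrix from a small sketch via the Nyström method. The following minimal NumPy sketch illustrates only that standard Nyström approximation, G ≈ K_xz K_zz^{-1} K_zx; the function names, RBF kernel, parameter values, and the random landmark selection are illustrative assumptions, not the paper's method, which instead optimizes an out-of-sample sketch.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian RBF kernel from pairwise squared Euclidean distances.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_gram(X, Z, gamma=1.0, reg=1e-8):
    # Nystrom approximation G_hat = K_xz K_zz^{-1} K_zx. Only an n x m
    # and an m x m kernel block are formed, never the full n x n Gram.
    K_xz = rbf_kernel(X, Z, gamma)          # n x m cross-kernel
    K_zz = rbf_kernel(Z, Z, gamma)          # m x m sketch Gram matrix
    # Small ridge term keeps the pseudo-inverse numerically stable.
    K_zz_inv = np.linalg.pinv(K_zz + reg * np.eye(len(Z)))
    return K_xz @ K_zz_inv @ K_xz.T

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
# Placeholder sketch: m = 50 randomly subsampled in-sample points.
# The paper's contribution is to replace this with an optimized,
# out-of-sample sketch; random landmarks are only a baseline here.
Z = X[rng.choice(len(X), size=50, replace=False)]
G_hat = nystrom_gram(X, Z, gamma=0.5)
G = rbf_kernel(X, X, gamma=0.5)             # full Gram, for the error check only
print(np.linalg.norm(G - G_hat) / np.linalg.norm(G))
```

Random landmarks are the simplest baseline; the quality of the approximation depends entirely on how well the sketch Z covers the data, which is what the paper's optimized sketches aim to improve.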