Oblivious Sketching of High-Degree Polynomial Kernels

Thomas Dybdahl Ahle, Michael Kapralov, Jakob Bæk Tejs Knudsen, R. Pagh, Ameya Velingker, David P. Woodruff, Amir Zandieh
Kernel methods are fundamental tools in machine learning that allow detection of non-linear dependencies between data without explicitly constructing feature vectors in high dimensional spaces. A major disadvantage of kernel methods is their poor scalability: primitives such as kernel PCA or kernel ridge regression generally take prohibitively large quadratic space and (at least) quadratic time, as kernel matrices are usually dense. Some methods for speeding up kernel linear algebra are known… 
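To make the "implicit feature vectors" point concrete, here is a minimal NumPy illustration (not from the paper) of why explicit feature maps are expensive: for the degree-2 polynomial kernel k(x, y) = (x·y)², the kernel trick evaluates the Gram matrix directly, while the explicit feature map x ⊗ x lives in d² dimensions (and d^p for degree p):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 4
X = rng.standard_normal((n, d))

# Kernel trick: evaluate the degree-2 polynomial kernel k(x, y) = (x . y)^2
# directly, without ever forming feature vectors.
K = (X @ X.T) ** 2

# Explicit feature map: phi(x) = x (kron) x lives in d^2 dimensions; for
# degree-p kernels this grows to d^p, which is why sketching is attractive.
Phi = np.stack([np.kron(x, x) for x in X])
assert np.allclose(K, Phi @ Phi.T)
```

The n x n Gram matrix is still dense, which is the quadratic-cost bottleneck the sketching techniques below attack.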


Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling

A near-input-sparsity-time algorithm is given for sampling the high-dimensional feature space implicitly defined by a kernel transformation, and its subspace embedding bounds are shown to imply new statistical guarantees for kernel ridge regression.

The GaussianSketch for Almost Relative Error Kernel Distance

We introduce two versions of a new sketch for approximately embedding the Gaussian kernel into Euclidean inner product space. These work by truncating infinite expansions of the Gaussian kernel.

Scaling up Kernel Ridge Regression via Locality Sensitive Hashing

A simple weighted version of random binning features is introduced and it is shown that the corresponding kernel function generates Gaussian processes of any desired smoothness, leading to efficient algorithms for kernel ridge regression.

Tensor Methods for Nonlinear Matrix Completion

A LADMC algorithm is proposed that leverages existing low-rank matrix completion (LRMC) methods on a tensorized representation of the data and outperforms existing state-of-the-art methods for matrix completion under a union-of-subspaces model.

Polynomial Tensor Sketch for Element-wise Function of Low-Rank Matrix

The main idea underlying the method is to combine a polynomial approximation of f with the existing tensor sketch scheme for approximating monomials of entries of A to balance the errors of the two approximation components in an optimal manner.

Tensorized Random Projections

The theoretical analysis shows that the dense Gaussian matrix in JLT can be replaced by a low-rank tensor implicitly represented in compressed form with random factors, while still approximately preserving the Euclidean distance of the projected inputs.

Faster Johnson-Lindenstrauss Transforms via Kronecker Products

The KFJLT reduces the embedding cost by an exponential factor relative to the standard fast Johnson–Lindenstrauss transform when applied to vectors with Kronecker structure, by avoiding explicitly forming the full Kronecker products.
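The algebraic identity that makes "avoiding explicit Kronecker products" possible is the mixed-product property, (A ⊗ B)(x ⊗ y) = (Ax) ⊗ (By). A small NumPy check (an illustration of the identity, not the KFJLT algorithm itself):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 5))
x = rng.standard_normal(4)
y = rng.standard_normal(5)

# Naive: materialize the 20-dimensional Kronecker vector and a 9 x 20 matrix.
naive = np.kron(A, B) @ np.kron(x, y)

# Structured: apply the factors separately, then combine -- the full
# Kronecker product is never formed (mixed-product property).
fast = np.kron(A @ x, B @ y)
assert np.allclose(naive, fast)
```

For vectors with p-fold Kronecker structure, this turns one huge matrix-vector product into p small ones.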

Fourier Sparse Leverage Scores and Approximate Kernel Learning

New explicit upper bounds on the leverage scores of Fourier sparse functions under both the Gaussian and Laplace measures are proved, which generalize existing work that only applies to uniformly distributed data.

Kernel Density Estimation through Density Constrained Near Neighbor Search

This paper gives a single data structure, based on classical near neighbor search techniques, that improves upon or essentially matches the query time and space complexity for all radial kernels considered in the literature so far; it also shows how to improve both the query complexity and runtime using recent advances in data-dependent near neighbor search.

Tensor Train Random Projection

This work proposes a novel tensor train random projection (TTRP) method for dimension reduction, where the pairwise distances can be approximately preserved and the isometry property of TTRP is proven.



Subspace Embeddings for the Polynomial Kernel

This work proposes the first fast oblivious subspace embeddings that are able to embed a space induced by a non-linear kernel without explicitly mapping the data to the high-dimensional space.

Fast and scalable polynomial kernels via explicit feature maps

A novel randomized tensor product technique, called Tensor Sketching, is proposed for approximating any polynomial kernel in O(n(d + D log D)) time, and achieves higher accuracy and often runs orders of magnitude faster than the state-of-the-art approach for large-scale real-world datasets.
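The Tensor Sketching idea can be sketched in a few lines of NumPy. This is an illustrative degree-2 version under my own naming (hash arrays `h`, sign arrays `s`, sketch size `D`), not the paper's implementation: CountSketch each factor, then convolve the two CountSketches via FFT, so that inner products of sketches estimate the degree-2 polynomial kernel.

```python
import numpy as np

def tensor_sketch2(x, h, s, D):
    """Degree-2 TensorSketch: CountSketch each copy of x, then multiply in
    the Fourier domain (= circular convolution of the two CountSketches)."""
    factors = []
    for hi, si in zip(h, s):
        c = np.zeros(D)
        np.add.at(c, hi, si * x)          # CountSketch of x
        factors.append(np.fft.fft(c))
    return np.fft.ifft(factors[0] * factors[1]).real

rng = np.random.default_rng(2)
d, D = 30, 4096
h = [rng.integers(0, D, d) for _ in range(2)]       # random hash functions
s = [rng.choice([-1.0, 1.0], d) for _ in range(2)]  # random signs

x = rng.standard_normal(d)
x /= np.linalg.norm(x)                               # unit vector

sx = tensor_sketch2(x, h, s, D)
# <TS(x), TS(x)> estimates (x . x)^2 = 1 up to sketching error.
est = sx @ sx
assert abs(est - 1.0) < 0.25
```

The FFT is what gives the D log D term in the stated running time; applying the CountSketches costs only the input sparsity.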

Faster Kernel Ridge Regression Using Sketching and Preconditioning

This paper proposes a preconditioning technique based on random feature maps, such as random Fourier features, which have recently emerged as a powerful technique for speeding up and scaling the training of kernel-based methods by resorting to approximations.

Fast Randomized Kernel Ridge Regression with Statistical Guarantees

A version of this approach that comes with running time guarantees as well as improved guarantees on its statistical performance is described, and a fast algorithm is presented to quickly compute coarse approximations to these scores in time linear in the number of samples.

Improved Approximation Algorithms for Large Matrices via Random Projections

  • Tamás Sarlós
  • Computer Science
    2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
  • 2006
The key idea is that low-dimensional embeddings can be used to eliminate data dependence and provide more versatile, linear-time, pass-efficient matrix computation.

Nearly Tight Oblivious Subspace Embeddings by Trace Inequalities

This analysis of sparse oblivious subspace embeddings is presented, based on the "matrix Chernoff" technique, and the bounds obtained are much tighter than previous ones, matching known lower bounds up to a single log(d) factor in embedding dimension.
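A sparse oblivious subspace embedding of the kind analyzed here can be demonstrated with a CountSketch-style matrix: one random ±1 entry per input coordinate. A minimal NumPy sketch (my own toy parameters, not the paper's construction): after sketching an orthonormal basis U of a random subspace, all singular values of SU should be close to 1.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, m = 1000, 5, 2000

# CountSketch-style sparse embedding: coordinate i is hashed to row h[i]
# with sign sgn[i], so S has exactly one nonzero per column.
h = rng.integers(0, m, n)
sgn = rng.choice([-1.0, 1.0], n)

def sketch(A):
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, h, sgn[:, None] * A)   # apply S in O(nnz(A)) time
    return SA

# Orthonormal basis of a random d-dimensional subspace of R^n.
U, _ = np.linalg.qr(rng.standard_normal((n, d)))
SU = sketch(U)

# Subspace embedding property: singular values of SU concentrate near 1,
# i.e. ||S A x|| is close to ||A x|| simultaneously for all x.
sv = np.linalg.svd(SU, compute_uv=False)
assert np.all(np.abs(sv - 1.0) < 0.5)
```

Applying the sketch costs time proportional to the number of nonzeros of the input, which is what makes these embeddings attractive for sparse data.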

Optimal Approximate Matrix Product in Terms of Stable Rank

We prove, using the subspace embedding guarantee in a black box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having…

Sketching as a Tool for Numerical Linear Algebra

This survey highlights the recent advances in algorithms for numerical linear algebra that have come from the technique of linear sketching, and considers least squares as well as robust regression problems, low rank approximation, and graph sparsification.

Randomized sketches of convex programs with sharp guarantees

This work analyzes random-projection-based approximations of convex programs, in which the original optimization problem is approximated by the solution of a lower-dimensional problem, and proves that the approximation ratio of this procedure can be bounded in terms of the geometry of the constraint set.

Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees

The results are twofold: on the one hand, it is shown that random Fourier feature approximation can provably speed up kernel ridge regression under reasonable assumptions, and on the other hand, the method is suboptimal, and sampling from a modified distribution in Fourier space, given by the leverage function of the kernel, yields provably better performance.
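The random Fourier features construction referenced here is easy to sketch in NumPy. This is the classical Rahimi–Recht recipe for the Gaussian kernel under my own parameter names (`sigma`, sketch size `D`), not the modified leverage-based sampling the paper advocates: by Bochner's theorem, sampling frequencies from a Gaussian and using random-phase cosine features gives an unbiased kernel estimate.

```python
import numpy as np

rng = np.random.default_rng(4)
d, D, sigma = 3, 4000, 1.0

# Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) via Bochner's
# theorem: frequencies W ~ N(0, 1/sigma^2), phases b ~ Uniform[0, 2 pi).
W = rng.standard_normal((D, d)) / sigma
b = rng.uniform(0.0, 2.0 * np.pi, D)

def rff(x):
    """Random Fourier feature map: E[rff(x) . rff(y)] = k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x = rng.standard_normal(d)
y = rng.standard_normal(d)

exact = np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
approx = rff(x) @ rff(y)
assert abs(approx - exact) < 0.1
```

The paper's point is that sampling W from this plain spectral density is suboptimal for kernel ridge regression; reweighting by the kernel's leverage function provably does better.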