Corpus ID: 16658740

Subspace Embeddings for the Polynomial Kernel

@inproceedings{Avron2014SubspaceEF,
  title={Subspace Embeddings for the Polynomial Kernel},
  author={Haim Avron and Huy L. Nguyen and David P. Woodruff},
  booktitle={NIPS},
  year={2014}
}
Sketching is a powerful dimensionality reduction tool for accelerating statistical learning algorithms. However, its applicability has been somewhat limited: the crucial ingredient, the so-called oblivious subspace embedding, can only be applied to data spaces with an explicit representation as the column span or row span of a matrix, whereas in many settings learning is done in a high-dimensional space implicitly defined by the data matrix via a kernel transformation. We propose… 
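To make the subspace-embedding property concrete, here is a minimal numpy sketch (an illustration only, not the paper's construction; the dense Gaussian sketch, the degree-2 kernel, and all sizes are assumptions): it forms the d²-dimensional degree-2 polynomial feature map explicitly and checks that an obliviously drawn sketch approximately preserves norms over the span of the mapped data points.

```python
import numpy as np

# Illustration of the oblivious subspace embedding (OSE) property for the
# degree-2 polynomial kernel. NOTE: this uses a dense Gaussian sketch purely
# for illustration; it is NOT the construction proposed in the paper, and the
# sizes below are arbitrary.
rng = np.random.default_rng(0)
n, d, m = 40, 30, 250                      # points, input dim, sketch size (assumed)

X = rng.standard_normal((n, d))
# Explicit feature map phi(x) = x (x) x, of dimension d^2 -- exactly what one
# wants to avoid forming for large d and higher degrees.
Phi = np.stack([np.outer(x, x).ravel() for x in X])      # shape (n, d^2)

# Oblivious sketch: drawn without looking at Phi, reduces d^2 -> m.
S = rng.standard_normal((m, d * d)) / np.sqrt(m)

# The OSE guarantee: ||S Phi^T y|| ~= ||Phi^T y|| for every y, i.e. the span of
# the mapped data points is preserved. Check one random direction.
y = rng.standard_normal(n)
print(np.linalg.norm(Phi.T @ y), np.linalg.norm(S @ Phi.T @ y))
```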

Citations

Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling
TLDR: A near-input-sparsity-time algorithm is given for sampling the high-dimensional feature space implicitly defined by a kernel transformation, and its subspace embedding bounds are shown to imply new statistical guarantees for kernel ridge regression.
Oblivious Sketching of High-Degree Polynomial Kernels
TLDR: This work gives a general method for applying sketching solutions developed in numerical linear algebra over the past decade to a tensoring of data points without forming the tensoring explicitly, and it leads to the first oblivious sketch for the polynomial kernel with a target dimension that is only polynomially dependent on the degree of the kernel function.
Relative Error RKHS Embeddings for Gaussian Kernels
TLDR: The main insight is to effectively modify the well-traveled random Fourier features so that they are slightly biased and have higher variance, but can be defined as a convolution over the function space.
Faster Kernel Ridge Regression Using Sketching and Preconditioning
TLDR: This paper proposes a preconditioning technique based on random feature maps, such as random Fourier features, which have recently emerged as a powerful technique for speeding up and scaling the training of kernel-based methods by resorting to approximations.
In-Database Regression in Input Sparsity Time
TLDR: This work designs subspace embeddings for database joins which can be computed significantly faster than computing the join, and extends the results to arbitrary joins for the ridge regression problem, significantly improving upon the running time of prior FAQ-based methods for regression.
Fast Sketching of Polynomial Kernels of Polynomial Degree
TLDR: A new oblivious sketch is given which greatly improves the running time of the fastest algorithms for approximating a large family of slow-growing kernels, by removing the dependence on q in the leading-order term.
Exponentially Improved Dimensionality Reduction for 𝓁1: Subspace Embeddings and Independence Testing
TLDR: The linear map gives a streaming algorithm for independence testing using space exponentially smaller than the (ε⁻¹ log d)^O(q) space bound of Braverman and Ostrovsky (STOC, 2010); for subspace embeddings, the setting in which A is itself drawn from a distribution with independent entries is studied, and a polynomial embedding dimension is obtained.
Relative Error Embeddings of the Gaussian Kernel Distance
TLDR: It is shown in this paper that, for the Gaussian kernel, the Euclidean distance between these mapped features has (1 + ε)-relative error with respect to the kernel distance.
Lower Memory Oblivious (Tensor) Subspace Embeddings with Fewer Random Bits: Modewise Methods for Least Squares
TLDR: Applications related to compression and fast compressed least-squares solution methods are considered, including those used for fitting low-rank CP decompositions, and the proposed JL embedding results are shown to work well numerically in both settings.
...
...

References

Showing 1-10 of 17 references
Fast and scalable polynomial kernels via explicit feature maps
TLDR: A novel randomized tensor product technique, called Tensor Sketching, is proposed for approximating any polynomial kernel in O(n(d + D log D)) time; it achieves higher accuracy and often runs orders of magnitude faster than the state-of-the-art approach on large-scale real-world datasets.
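As a rough sketch of the Tensor Sketching idea described in this reference (hedged: the function name, seed handling, and the chosen D and q are illustrative, and this is my reading of the standard construction rather than the authors' code): each point is hashed with q independent CountSketches, whose FFTs are multiplied entrywise, so that inner products of the resulting features approximate ⟨x, y⟩^q.

```python
import numpy as np

def tensor_sketch(X, D, q, seed=None):
    """Approximate the degree-q polynomial kernel feature map of the rows of X.

    E[<ts(x), ts(y)>] = <x, y>**q, and each point costs O(q*(d + D log D))
    via the convolution theorem (FFTs of q CountSketches, multiplied entrywise).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    rows = np.arange(n)[:, None]
    prod = np.ones((n, D), dtype=complex)
    for _ in range(q):
        h = rng.integers(0, D, size=d)               # CountSketch bucket per coordinate
        s = rng.choice([-1.0, 1.0], size=d)          # CountSketch sign per coordinate
        cs = np.zeros((n, D))
        np.add.at(cs, (rows, h[None, :]), X * s)     # CountSketch of every row of X
        prod *= np.fft.fft(cs, axis=1)
    return np.fft.ifft(prod, axis=1).real            # n x D sketched features

# Inner products of the sketches approximate the polynomial kernel (X @ X.T)**q;
# the approximation error shrinks as the sketch dimension D grows.
X = np.random.default_rng(1).standard_normal((50, 20))
Z = tensor_sketch(X, D=4096, q=2, seed=0)
print(np.linalg.norm(Z @ Z.T - (X @ X.T) ** 2) / np.linalg.norm((X @ X.T) ** 2))
```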
Low-Rank Approximation and Regression in Input Sparsity Time
We design a new distribution over m × n matrices S so that, for any fixed n × d matrix A of rank r, with probability at least 9/10, ∥SAx∥₂ = (1 ± ε)∥Ax∥₂ simultaneously for all x ∈ R^d. Here, m is …
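The distribution over S described in this reference is, as far as this excerpt shows, the sparse-embedding (CountSketch) construction: each column of S has a single ±1 entry in a uniformly random row, so SA can be formed in time proportional to the number of nonzeros of A. A minimal numpy version (the sketch size m and the check below are illustrative assumptions):

```python
import numpy as np

def sparse_embed(A, m, seed=None):
    """Apply a sparse embedding (CountSketch) matrix S of shape (m, n) to A (n x d).

    Each column of S holds a single +-1 in a uniformly random row, so computing
    S @ A only touches each entry of A once, i.e. O(nnz(A)) time.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    bucket = rng.integers(0, m, size=n)              # target row for each column of S
    sign = rng.choice([-1.0, 1.0], size=n)
    SA = np.zeros((m, d))
    np.add.at(SA, bucket, sign[:, None] * A)         # scatter-add rows of A into buckets
    return SA

# Sanity check of the embedding property for one direction x: ||S A x|| ~ ||A x||.
rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 10))
x = rng.standard_normal(10)
SA = sparse_embed(A, m=400, seed=1)
print(np.linalg.norm(A @ x), np.linalg.norm(SA @ x))
```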
Compact Random Feature Maps
TLDR: Error bounds for CRAFTMaps are proved, demonstrating their superior kernel reconstruction performance compared to previous approximation schemes, and it is shown how structured random matrices can be used to efficiently generate CRAFTMaps.
Numerical linear algebra in the streaming model
TLDR: Near-optimal space bounds are given in the streaming model for linear algebra problems that include estimation of matrix products, linear regression, low-rank approximation, and approximation of matrix rank; results for turnstile updates are proved.
Sketching Structured Matrices for Faster Nonlinear Regression
TLDR: This work considers a class of structured regression problems involving Vandermonde matrices, which arise naturally in various statistical modeling settings, and shows that this structure can be exploited to further accelerate the solution of the regression problem.
Random Features for Large-Scale Kernel Machines
TLDR: Two sets of random features are explored, convergence bounds on their ability to approximate various radial basis kernels are provided, and it is shown that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines.
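A minimal random-Fourier-features sketch for the Gaussian kernel may help ground this entry (hedged: the bandwidth σ, the feature count D, and the helper name are illustrative assumptions, and only the radial-basis case is shown): draw frequencies W with N(0, σ⁻²) entries and phases b uniform on [0, 2π); then z(x) = √(2/D)·cos(Wᵀx + b) gives z(x)·z(y) ≈ exp(−‖x − y‖²/(2σ²)).

```python
import numpy as np

def rff(X, D, sigma, seed=None):
    """Random Fourier features whose inner products approximate the Gaussian
    kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((d, D)) / sigma          # frequencies ~ N(0, sigma^-2 I)
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)        # random phase shifts
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# z(x) . z(y) concentrates around the exact kernel value as D grows.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))
Z = rff(X, D=20000, sigma=2.0, seed=1)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
print(np.abs(Z @ Z.T - np.exp(-sq_dists / (2.0 * 2.0 ** 2))).max())
```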
OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings
TLDR: The main result is essentially a Bai-Yin-type theorem in random matrix theory and is likely to be of independent interest: for any fixed U ∈ R^{n×d} with orthonormal columns and random sparse Π, all singular values of ΠU lie in [1 - ε, 1 + ε] with good probability.
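To make the singular-value statement concrete, a small empirical check (hedged: the parameters m and s below are arbitrary choices, not the paper's recommended settings) draws an OSNAP-style matrix Π with s nonzero ±1/√s entries per column and verifies that the singular values of ΠU stay close to 1 for an orthonormal U.

```python
import numpy as np

def osnap(n, m, s, seed=None):
    """OSNAP-style sparse embedding (m x n): each column has s nonzeros equal to +-1/sqrt(s)."""
    rng = np.random.default_rng(seed)
    P = np.zeros((m, n))
    for j in range(n):
        target = rng.choice(m, size=s, replace=False)        # s distinct rows per column
        P[target, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return P

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((5000, 20)))         # fixed U with orthonormal columns
P = osnap(5000, m=1000, s=8, seed=1)
sv = np.linalg.svd(P @ U, compute_uv=False)
print(sv.min(), sv.max())                                    # all singular values near 1
```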
Fastfood: Approximate Kernel Expansions in Loglinear Time
TLDR: Improvements to Fastfood, an approximation that accelerates kernel methods significantly and achieves similar accuracy to full kernel expansions and Random Kitchen Sinks while being 100x faster and using 1000x less memory, make kernel methods more practical for applications that have large training sets and/or require real-time prediction.
Relative-Error CUR Matrix Decompositions
TLDR: These two algorithms are the first polynomial-time algorithms for such low-rank matrix approximations that come with relative-error guarantees; previously, in some cases, it was not even known whether such matrix decompositions exist.
Randomized Algorithms for Matrices and Data
TLDR: This monograph provides a detailed overview of recent work on the theory of randomized matrix algorithms as well as the application of those ideas to the solution of practical problems in large-scale data analysis.
...
...