Corpus ID: 235825478

Fast Sketching of Polynomial Kernels of Polynomial Degree

@article{Song2021FastSO,
  title={Fast Sketching of Polynomial Kernels of Polynomial Degree},
  author={Zhao Song and David P. Woodruff and Zheng Yu and Lichen Zhang},
  journal={ArXiv},
  year={2021},
  volume={abs/2108.09420}
}
Kernel methods are fundamental in machine learning, and faster algorithms for kernel approximation provide direct speedups for many core tasks. The polynomial kernel is especially important as other kernels can often be approximated by the polynomial kernel via a Taylor series expansion. Recent techniques in oblivious sketching reduce the running time's dependence on the degree q of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian… 
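To make the sketching idea concrete, below is a minimal NumPy sketch of the classical TensorSketch construction (q independent CountSketches combined via FFT), whose inner products approximate the degree-q polynomial kernel ⟨x, y⟩^q without ever forming the d^q-dimensional tensor product explicitly. This is only an illustration of the general technique the paper builds on, not the improved sketch proposed here; the function name, target dimension m, and test data are arbitrary choices.

```python
import numpy as np

def tensor_sketch(X, q, m, seed=0):
    """TensorSketch of the q-fold self tensor product of each row of X.

    Inner products of the returned rows approximate <x, y>**q, i.e. the
    degree-q polynomial kernel, in a target dimension m << d**q.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # q independent CountSketch hash buckets and random signs.
    buckets = rng.integers(0, m, size=(q, d))
    signs = rng.choice([-1.0, 1.0], size=(q, d))

    rows = np.arange(n)[:, None]
    prod = np.ones((n, m), dtype=complex)
    for i in range(q):
        cs = np.zeros((n, m))
        # CountSketch: coordinate j of each row is added to bucket buckets[i, j]
        # with sign signs[i, j]; np.add.at accumulates colliding coordinates.
        np.add.at(cs, (rows, buckets[i][None, :]), X * signs[i])
        prod *= np.fft.fft(cs, axis=1)
    # Multiplying in the Fourier domain convolves the q CountSketches.
    return np.fft.ifft(prod, axis=1).real

# Sanity check: sketched inner products vs. the exact polynomial kernel (x . y)^q.
rng = np.random.default_rng(1)
X = rng.standard_normal((6, 32)) / np.sqrt(32)
S = tensor_sketch(X, q=3, m=512)
print(np.max(np.abs(S @ S.T - (X @ X.T) ** 3)))
```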


Improved Random Features for Dot Product Kernels

TLDR
A generalization of existing random feature approximations for polynomial kernels, such as Rademacher and Gaussian sketches and TensorSRHT, using complex-valued random features is presented, and it is shown empirically that the use of complex features can significantly reduce the variances of these approximations.

Complex-to-Real Random Features for Polynomial Kernels

TLDR
Complex-to-Real (CtR) random features for polynomial kernels are introduced that leverage intermediate complex random projections and can yield kernel estimates with much lower variances than their real-valued analogs.

Random Gegenbauer Features for Scalable Kernel Methods

TLDR
This work proposes efficient random features for approximating a new and rich class of kernel functions that it refers to as Generalized Zonal Kernels (GZK), and proves subspace embedding guarantees for Gegenbauer features, which ensure that these features can be used for approximately solving learning problems such as kernel k-means clustering and kernel ridge regression.

Speeding Up Sparsification using Inner Product Search Data Structures

TLDR
The heart of the work is the design of a variety of different inner product search data structures that have efficient initialization, query, and update time, are compatible with dimensionality reduction, and are robust against an adaptive adversary.

Oblivious Sketching-based Central Path Method for Linear Programming

TLDR
This method enjoys the benefits of being both oblivious and feasible, and can use a sparse sketching matrix (Nelson & Nguyên, 2013) to speed up the online matrix-vector multiplication.

Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time

We propose an input sparsity time sampling algorithm that can spectrally approximate the Gram matrix corresponding to the q-fold column-wise tensor product of q matrices using a nearly optimal number of samples.

Fast Graph Neural Tangent Kernel via Kronecker Sketching

TLDR
This paper provides the first algorithm to construct the kernel matrix in o(n^2N^3) running time; constructing this matrix is otherwise a major performance bottleneck as the graph size N increases.

Fast Algorithms for Monotone Lower Subsets of Kronecker Least Squares Problems

TLDR
This paper develops efficient leverage score-based sampling methods for matrices with certain Kronecker product-type structure, and numerical examples show that sketches based on exact leverage score sampling for a class of structured matrices achieve superior residuals compared to approximate leverage score sampling methods.

Differentially Private Learning with Margin Guarantees

TLDR
A series of new differentially private (DP) algorithms with dimension-independent margin guarantees for linear hypotheses is presented, together with a general label DP learning algorithm that benefits from relative deviation margin bounds and is applicable to a broad family of hypothesis sets, including that of neural networks.

Sublinear Time Algorithm for Online Weighted Bipartite Matching

TLDR
This work provides the theoretical foundation for computing the weights approximately and shows that, with the proposed randomized data structures, the weights can be computed in sublinear time while still preserving the competitive ratio of the matching algorithm.

References

SHOWING 1-10 OF 62 REFERENCES

Oblivious Sketching of High-Degree Polynomial Kernels

TLDR
This work gives a general method for applying sketching solutions developed in numerical linear algebra over the past decade to a tensoring of data points without forming the tensoring explicitly, leading to the first oblivious sketch for the polynomial kernel with a target dimension that is only polynomially dependent on the degree of the kernel function.

Subspace Embeddings for the Polynomial Kernel

TLDR
This work proposes the first fast oblivious subspace embeddings that are able to embed a space induced by a non-linear kernel without explicitly mapping the data to the high-dimensional space.

Fast Randomized Kernel Ridge Regression with Statistical Guarantees

TLDR
A version of this approach that comes with running time guarantees as well as improved guarantees on its statistical performance is described, and a fast algorithm is presented to quickly compute coarse approximations to the leverage scores in time linear in the number of samples.

Faster Kernel Ridge Regression Using Sketching and Preconditioning

TLDR
This paper proposes a preconditioning technique based on random feature maps, such as random Fourier features, which have recently emerged as a powerful technique for speeding up and scaling the training of kernel-based methods by resorting to approximations.
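As a rough, hedged illustration of this preconditioning idea (not the exact construction from the reference), the sketch below forms random Fourier features Z with Z Zᵀ ≈ K and uses the Woodbury identity to apply (Z Zᵀ + λI)⁻¹ as a preconditioner for conjugate gradient on the exact regularized kernel system (K + λI)α = y. NumPy/SciPy are assumed, and the bandwidth gamma, feature count m, and regularization lam are arbitrary illustrative values.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.sparse.linalg import LinearOperator, cg

def gaussian_kernel(X, gamma=1.0):
    sq = np.sum(X ** 2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))

rng = np.random.default_rng(0)
n, d, m, lam, gamma = 500, 5, 200, 1e-2, 1.0
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

# Exact regularized kernel system (K + lam * I) alpha = y.
A = gaussian_kernel(X, gamma) + lam * np.eye(n)

# Random Fourier features Z with Z @ Z.T ~ K, used only inside the preconditioner.
W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, m))
b = rng.uniform(0.0, 2.0 * np.pi, size=m)
Z = np.sqrt(2.0 / m) * np.cos(X @ W + b)

# Woodbury: (Z Z^T + lam I)^{-1} v = (v - Z (lam I_m + Z^T Z)^{-1} Z^T v) / lam.
chol = cho_factor(lam * np.eye(m) + Z.T @ Z)
precond = LinearOperator((n, n),
                         matvec=lambda v: (v - Z @ cho_solve(chol, Z.T @ v)) / lam)

alpha, info = cg(A, y, M=precond, maxiter=100)
print("cg info:", info, " residual:", np.linalg.norm(A @ alpha - y))
```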

Improved Approximation Algorithms for Large Matrices via Random Projections

  • Tamás Sarlós
  • 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
TLDR
The key idea is that low-dimensional embeddings can be used to eliminate data dependence and provide more versatile, linear-time, pass-efficient matrix computation.

Sharp analysis of low-rank kernel matrix approximations

TLDR
This paper shows that in the context of kernel ridge regression, for approximations based on a random subset of columns of the original kernel matrix, the rank p may be chosen to be linear in the degrees of freedom associated with the problem, a quantity which is classically used in the statistical analysis of such methods.

Scaling up Kernel Ridge Regression via Locality Sensitive Hashing

TLDR
A simple weighted version of random binning features is introduced and it is shown that the corresponding kernel function generates Gaussian processes of any desired smoothness, leading to efficient algorithms for kernel ridge regression.

Recursive Sampling for the Nyström Method

We give the first algorithm for kernel Nyström approximation that runs in linear time in the number of training points and is provably accurate for all kernel matrices, without dependence on… 

Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees

TLDR
The results are twofold: on the one hand, it is shown that random Fourier feature approximation can provably speed up kernel ridge regression under reasonable assumptions, and on the other hand, the method is suboptimal, and sampling from a modified distribution in Fourier space, given by the leverage function of the kernel, yields provably better performance.
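For concreteness, here is a minimal random Fourier feature sketch of Gaussian-kernel ridge regression in the plain (data-oblivious) sampling regime discussed above; sampling from the leverage-based distribution the reference advocates is not shown. The function names, feature count, and toy data are illustrative assumptions, and NumPy is assumed.

```python
import numpy as np

def rff_features(X, n_features, gamma, seed=0):
    """Random Fourier features with E[z(x) @ z(y)] = exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # The Gaussian kernel's spectral density is N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rff_krr_fit(X, y, n_features=500, gamma=1.0, lam=1e-2, seed=0):
    """Ridge regression in the random feature space: an m x m solve instead of n x n."""
    Z = rff_features(X, n_features, gamma, seed)
    return np.linalg.solve(Z.T @ Z + lam * np.eye(n_features), Z.T @ y)

# Toy usage: fit on noisy data, predict with the same random features (same seed).
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
w = rff_krr_fit(X, y)
pred = rff_features(X, 500, 1.0, seed=0) @ w
```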

Preconditioning Kernel Matrices

TLDR
A scalable approach to both solving kernel machines and learning their hyperparameters is described, and it is shown this approach is exact in the limit of iterations and outperforms state-of-the-art approximations for a given computational budget.
...