# Random Features for Kernel Approximation: A Survey in Algorithms, Theory, and Beyond

@article{Liu2021RandomFF, title={Random Features for Kernel Approximation: A Survey in Algorithms, Theory, and Beyond}, author={F. Liu and Xiaolin Huang and Yudong Chen and J. Suykens}, journal={IEEE transactions on pattern analysis and machine intelligence}, year={2021}, volume={PP} }

Random features are among the most popular techniques for speeding up kernel methods in large-scale problems. Related works have been recognized by the NeurIPS Test-of-Time Award in 2017 and as an ICML Best Paper Finalist in 2019. The body of work on random features has grown rapidly, and hence it is desirable to have a comprehensive overview of this topic explaining the connections among various algorithms and theoretical results. In this survey, we systematically review the work on random features…
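Since the survey centers on this construction, a minimal sketch of the classical random Fourier features map for the Gaussian kernel may help fix ideas (the function name, data, and bandwidth below are illustrative, following the standard Rahimi–Recht recipe):

```python
import numpy as np

def random_fourier_features(X, D, gamma, rng):
    """Map X (n, d) to D random Fourier features whose inner products
    approximate the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    n, d = X.shape
    # Frequencies are drawn from the kernel's spectral density N(0, 2*gamma*I);
    # phases are uniform on [0, 2*pi).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z = random_fourier_features(X, D=2000, gamma=0.5, rng=rng)
K_approx = Z @ Z.T  # approximates the exact n x n Gram matrix in O(n^2 * D) time
```

The point of the construction is that downstream linear methods can work with the explicit D-dimensional map `Z` instead of the full Gram matrix.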


#### 17 Citations

On the Approximation Lower Bound for Neural Nets with Random Weights

- Computer Science, Mathematics
- ArXiv
- 2020

It is shown that, despite the well-known fact that a shallow neural network is a universal approximator, a random net cannot achieve zero approximation error even for smooth functions; in particular, it is proved that if the proposal distribution is compactly supported, then the lower bound is strictly positive.

Fast Learning in Reproducing Kernel Krein Spaces via Generalized Measures

- Computer Science
- 2020

In this paper, we attempt to solve a long-standing open question concerning non-positive definite (non-PD) kernels: can a given non-PD kernel be decomposed into the difference of two PD kernels (termed…

Kernel approximation on algebraic varieties

- Computer Science, Mathematics
- ArXiv
- 2021

The main technical insight is to approximate smooth kernels by polynomial kernels, and to leverage two key properties of polynomial kernels that hold when they are restricted to a variety.

Learning Data-adaptive Nonparametric Kernels

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2020

A Data-Adaptive Nonparametric Kernel (DANK) learning framework is proposed by imposing an adaptive matrix on the kernel/Gram matrix in an entry-wise strategy; it outperforms other representative kernel-learning algorithms on various classification and regression benchmark datasets.

Global Convergence and Induced Kernels of Gradient-Based Meta-Learning with Neural Nets

- Computer Science, Mathematics
- ArXiv
- 2020

It is proved that GBML is equivalent to a functional gradient descent operation that explicitly propagates experience from past tasks to new ones, and a new kernel-based meta-learning approach is developed that outperforms GBML with standard DNNs on the Omniglot dataset when the number of past tasks for meta-training is small.

Shallow Representation is Deep: Learning Uncertainty-aware and Worst-case Random Feature Dynamics

- Computer Science, Engineering
- ArXiv
- 2021

It is shown that finding worst-case dynamics realizations using Pontryagin’s minimum principle is equivalent to performing the Frank-Wolfe algorithm on the deep net, and the whole dynamical system is viewed as a multi-layer neural network.

Sample and Computationally Efficient Simulation Metamodeling in High Dimensions

- Mathematics, Computer Science
- 2020

This work develops a novel methodology that dramatically alleviates the curse of dimensionality, and demonstrates via extensive numerical experiments that the methodology can handle problems with a design space of hundreds of dimensions, improving both prediction accuracy and computational efficiency by orders of magnitude relative to typical alternative methods in practice.

An Insect-Inspired Randomly, Weighted Neural Network with Random Fourier Features For Neuro-Symbolic Relational Learning

- Computer Science
- ArXiv
- 2021

The computer-science field of Knowledge Representation and Reasoning (KRR) aims to understand, reason, and interpret knowledge as efficiently as human beings do. Because many logical formalisms and…

Fast Learning in Reproducing Kernel Krein Spaces via Signed Measures

- Computer Science
- AISTATS
- 2021

This paper casts this question in a distributional view by introducing signed measures, which transform positive decomposition into measure decomposition: a series of non-PD kernels can be associated with linear combinations of specific finite Borel measures. This provides a necessary and sufficient condition to answer the open question.

Kernel regression in high dimension: Refined analysis beyond double descent

- Computer Science, Mathematics
- AISTATS
- 2021

This refined analysis goes beyond the double descent theory by showing that, depending on the data eigen-profile and the level of regularization, the kernel regression risk curve can be a double-descent-like, bell-shaped, or monotonic function of $n$.

#### References

Showing 1–10 of 213 references

On Data-Dependent Random Features for Improved Generalization in Supervised Learning

- Computer Science, Mathematics
- AAAI
- 2018

This paper proposes the Energy-based Exploration of Random Features (EERF) algorithm based on a data-dependent score function that explores the set of possible features and exploits the promising regions and proves that the proposed score function with high probability recovers the spectrum of the best fit within the model class.

A General Scoring Rule for Randomized Kernel Approximation with Application to Canonical Correlation Analysis

- Computer Science, Mathematics
- ArXiv
- 2019

A general scoring rule for sampling random features is proposed, which can be employed for various applications with minor adjustments; it provides a principled guide for finding the distribution that maximizes the canonical correlations, resulting in a novel data-dependent method for sampling features.

Random Features for Shift-Invariant Kernels with Moment Matching

- Computer Science
- AAAI
- 2017

This paper presents a novel sampling algorithm powered by moment-matching techniques to reduce the variance of random features, and proves the superiority of the proposed algorithm in Gram matrix approximation and generalization error in regression.
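The moment-matching idea can be illustrated with a minimal antithetic-sampling sketch (an illustrative variant, not the paper's exact algorithm): pairing each Gaussian frequency with its negation forces the empirical first moment to match the zero mean of the spectral distribution exactly, which reduces the variance of the resulting feature map.

```python
import numpy as np

def moment_matched_frequencies(d, D, gamma, rng):
    # Draw D/2 frequencies from the Gaussian spectral density N(0, 2*gamma*I)
    # and pair each with its negation: the sample mean is then exactly zero,
    # matching the first moment of the target distribution by construction.
    half = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D // 2))
    return np.concatenate([half, -half], axis=1)

W = moment_matched_frequencies(d=5, D=200, gamma=0.5, rng=np.random.default_rng(0))
```

The returned frequency matrix can be plugged into the usual random Fourier feature map in place of i.i.d. samples.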

Optimal Rates for Random Fourier Features

- Computer Science, Mathematics
- NIPS
- 2015

A detailed finite-sample theoretical analysis of the approximation quality of RFFs is provided, establishing optimal (in terms of the RFF dimension and growing set size) performance guarantees in uniform norm, and presenting guarantees in L^r (1 ≤ r < ∞) norms.

Data-driven Random Fourier Features using Stein Effect

- Computer Science, Mathematics
- IJCAI
- 2017

A novel shrinkage estimator based on the "Stein effect" is presented, which provides a data-driven weighting strategy for random features and enjoys theoretical justification in terms of lowering the empirical risk, along with an efficient randomized algorithm for large-scale applications of the proposed method.

Towards a Unified Analysis of Random Fourier Features

- Computer Science, Mathematics
- ICML
- 2019

This work provides the first unified risk analysis of learning with random Fourier features using squared error and Lipschitz continuous loss functions, and devises a simple approximation scheme that provably reduces the computational cost without loss of statistical efficiency.

Data-dependent compression of random features for large-scale kernel approximation

- Computer Science, Mathematics
- AISTATS
- 2019

This work proposes to combine the simplicity and generality of RFMs with a data-dependent feature selection scheme to achieve the desirable theoretical approximation properties of Nyström with just O(log J+) features, and shows that the method achieves small kernel matrix approximation error and better test-set accuracy with provably fewer random features than state-of-the-art methods.

Random Fourier Features via Fast Surrogate Leverage Weighted Sampling

- Computer Science, Mathematics
- AAAI
- 2020

A fast surrogate leverage weighted sampling strategy is proposed to generate refined random Fourier features for kernel approximation; theoretical guarantees on the generalization performance of this approach are provided, in particular characterizing the number of random features required to achieve statistical guarantees in KRR.

On the Error of Random Fourier Features

- Computer Science, Mathematics
- UAI
- 2015

The uniform error bound on random Fourier features from the original paper is improved, and novel understandings of the embedding's variance, approximation error, and use in some machine learning methods are given.
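The approximation error these analyses bound can also be observed empirically. The following sketch (illustrative data and bandwidth, assumed for demonstration) measures the uniform error of a random Fourier feature Gram matrix against the exact Gaussian kernel as the number of features grows:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
gamma = 0.5

# Exact Gaussian kernel Gram matrix for reference
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)

# Empirical uniform (max-norm) error of the RFF approximation as D grows
for D in (100, 1000, 10000):
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(3, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)
    err = np.abs(Z @ Z.T - K).max()
    print(D, err)  # the error shrinks roughly like 1/sqrt(D)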

Simple and Almost Assumption-Free Out-of-Sample Bound for Random Feature Mapping

- Mathematics, Computer Science
- ArXiv
- 2019

This paper studies kernel ridge regression with random feature mapping (RFM-KRR) and establishes novel out-of-sample error upper and lower bounds and is completely based on elementary linear algebra and thereby easy to read and verify. Expand