• Corpus ID: 21755532

A representer theorem for deep kernel learning

Bastian Bohn, Michael Griebel, Christian Rieger
In this paper we provide a representer theorem for the concatenation of (linear combinations of) kernel functions of reproducing kernel Hilbert spaces. This result serves as a mathematical foundation for the analysis of machine learning algorithms based on compositions of functions. As a direct consequence, the corresponding infinite-dimensional minimization problems can be recast into (nonlinear) finite-dimensional minimization problems, which can be tackled with nonlinear optimization…

What is an RKHS?

The representer theorem for various learning problems in the reproducing kernel Hilbert space framework is reviewed, with solutions to penalized least squares and penalized likelihood for nonparametric regression, and support vector machines for classification as a solution to the penalized hinge loss.

Universality and Optimality of Structured Deep Kernel Networks

A recent deep kernel representer theorem is leveraged to connect the two approaches and understand their interplay, showing that the use of special types of kernels yields models reminiscent of neural networks that are founded in the same theoretical framework as classical kernel methods, while enjoying many computational properties of deep neural networks.

An Efficient Empirical Solver for Localized Multiple Kernel Learning via DNNs

  • Ziming Zhang
  • Computer Science
    2020 25th International Conference on Pattern Recognition (ICPR)
  • 2021
This paper proposes parameterizing the gating function for learning kernel combination weights and the multiclass classifier using an attentional network (AN) and a multilayer perceptron (MLP), respectively, to help understand how the network solves the problem.

What Kinds of Functions do Deep Neural Networks Learn? Insights from Variational Spline Theory

A new function space, which is reminiscent of classical bounded variation spaces, that captures the compositional structure associated with deep neural networks is proposed, and a representer theorem is derived showing that deep ReLU networks are solutions to regularized data fitting problems in this function space.

A Unifying Representer Theorem for Inverse Problems and Machine Learning

  • M. Unser
  • Mathematics
    Found. Comput. Math.
  • 2021
A general representer theorem is presented that characterizes the solutions of a remarkably broad class of optimization problems and is used to retrieve a number of known results in the literature, e.g., the celebrated representer theorem of machine learning for RKHS, Tikhonov regularization, representer theorems for sparsity-promoting functionals, and the recovery of spikes.

LMKL-Net: A Fast Localized Multiple Kernel Learning Solver via Deep Neural Networks

Overall, LMKL-Net can not only outperform the state-of-the-art MKL solvers in terms of accuracy, but also be trained about two orders of magnitude faster with a much smaller memory footprint for large-scale learning.

Learning rates for the kernel regularized regression with a differentiable strongly convex loss

The learning rates when the hypothesis RKHS's logarithmic complexity exponent is arbitrarily small as well as sufficiently large are provided and the robustness with the maximum mean discrepancy and the Hutchinson metric is shown.

A kernel‐expanded stochastic neural network

The so‐called kernel‐expanded stochastic neural network (K‐StoNet) model is proposed, which incorporates support vector regression as the first hidden layer and reformulates the neural network as a latent variable model, and possesses a theoretical guarantee to asymptotically converge to the global optimum.

A Kernel Perspective for the Decision Boundary of Deep Neural Networks

  • Yifan Zhang, Shizhong Liao
  • Computer Science
    2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI)
  • 2020
It is argued that the multi-layer nonlinear feature transformation in deep neural networks is equivalent to a kernel feature mapping, and it is analyzed from the perspective of the unique mathematical advantages of kernel methods and of the method of constructing multi-layer kernel machines.

Deep Spectral Kernel Learning

A novel deep spectral kernel network (DSKN) is proposed to naturally integrate nonstationary and non-monotonic spectral kernels into elegant deep architectures in an interpretable way, which can be further generalized to cover most kernels.



Kernel Methods for Deep Learning

A new family of positive-definite kernel functions that mimic the computation in large, multilayer neural nets is introduced; these kernels can be used in shallow architectures, such as support vector machines (SVMs), or in deep kernel-based architectures that the authors call multilayer kernel machines (MKMs).

Deep Kernel Learning

We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs

On Learning Vector-Valued Functions

This letter provides a study of learning in a Hilbert space of vector-valued functions and derives the form of the minimal norm interpolant to a finite set of data and applies it to study some regularization functionals that are important in learning theory.

Deep Multiple Kernel Learning

This paper combines kernels at each layer and then optimizes over an estimate of the support vector machine leave-one-out error rather than the dual objective function to improve performance on a variety of datasets.

Multiple kernel learning, conic duality, and the SMO algorithm

Experimental results are presented that show that the proposed novel dual formulation of the QCQP as a second-order cone programming problem is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.

Two-Layer Multiple Kernel Learning

This paper investigates a framework of Multi-Layer Multiple Kernel Learning that aims to learn “deep” kernel machines by exploring the combinations of multiple kernels in a multi-layer structure, which goes beyond the conventional MKL approach.

Deep multilayer multiple kernel learning

This paper proposes to optimize the network over an adaptive backpropagation MLMKL framework using the gradient ascent method instead of the dual objective function or the estimation of the leave-one-out error, and achieves high performance.

Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond

  • A. Atiya
  • Computer Science
    IEEE Transactions on Neural Networks
  • 2005
This book is an excellent choice for readers who wish to familiarize themselves with computational intelligence techniques or for an overview/introductory course in the field of computational intelligence.

Approximation capabilities of multilayer feedforward networks