Scalable Kernel Learning Via the Discriminant Information

  • Mert Al, Zejiang Hou, S. Y. Kung
  • Published 23 September 2019
  • Computer Science
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Kernel approximation methods create explicit, low-dimensional kernel feature maps to deal with the high computational and memory complexity of standard techniques. This work studies a supervised kernel learning methodology to optimize such mappings. We utilize the Discriminant Information criterion, a measure of class separability with a strong connection to Discriminant Analysis. By generalizing this measure to cover a wider range of kernel maps and learning settings, we develop scalable… 
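To make the idea of an explicit, low-dimensional kernel feature map concrete, the following is a minimal sketch of random Fourier features for the Gaussian (RBF) kernel. It is not the supervised, Discriminant Information based method this paper develops; it only illustrates the kind of unsupervised mapping such a method could start from. The function names, the bandwidth `gamma`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """Exact RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dists = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def random_fourier_features(X, n_features, gamma, rng):
    """Explicit feature map z(x) such that z(x) @ z(y) approximates k(x, y).

    For k(x, y) = exp(-gamma * ||x - y||^2), the frequencies are drawn
    from a Gaussian with variance 2 * gamma in each dimension.
    """
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
gamma = 0.1

Z = random_fourier_features(X, n_features=2000, gamma=gamma, rng=rng)
K_exact = rbf_kernel(X, X, gamma)
K_approx = Z @ Z.T  # inner products of explicit features

max_err = np.abs(K_exact - K_approx).max()
print(f"feature matrix shape: {Z.shape}")
print(f"max abs kernel approximation error: {max_err:.3f}")
```

With the features in `Z`, any linear model can be trained in place of a kernel machine, so the n-by-n kernel matrix never has to be formed. A supervised approach such as the one studied here would tune the map's parameters rather than leave them random.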


Privacy Enhancing Machine Learning via Removal of Unwanted Dependencies
New variants of supervised and adversarial learning methods are studied that remove sensitive information from data before it is sent out for a particular application. These methods can maintain the utility performance of predictive models while causing predictions of the sensitive information to perform poorly.


Random Features for Large-Scale Kernel Machines
Two sets of random features are explored, convergence bounds on their ability to approximate various radial basis kernels are provided, and it is shown that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines.
Scalable Kernel Methods via Doubly Stochastic Gradients
An approach is proposed that scales up kernel methods using a novel concept called "doubly stochastic functional gradients", based on the fact that many kernel methods can be expressed as convex optimization problems. The approach can readily scale kernel methods up to the regimes that are dominated by neural nets.
A la Carte - Learning Fast Kernels
This work introduces a family of fast, flexible, lightly parametrized and general purpose kernel learning methods, derived from Fastfood basis function expansions, and provides mechanisms to learn the properties of groups of spectral frequencies in these expansions.
Nonlinear Discriminant Analysis Using Kernel Functions
The presented algorithm allows a simple formulation of the EM-algorithm in terms of kernel functions, which leads to a unique concept for unsupervised mixture analysis, supervised discriminant analysis and semi-supervised discriminant analysis with partially unlabelled observations in feature spaces.
Ensemble Nystrom Method
A new family of algorithms based on mixtures of Nystrom approximations, called ensemble Nystrom algorithms, is introduced; these yield more accurate low-rank approximations than the standard Nystrom method.
How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets
This work develops methods to scale up kernel models to successfully tackle large-scale learning problems that are so far only approachable by deep learning architectures, and conducts extensive empirical studies on problems from image recognition and automatic speech recognition.
Revisiting the Nystrom Method for Improved Large-scale Machine Learning
An empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices is complemented by a suite of worst-case theoretical bounds for both random sampling and random projection methods.
Improved Nyström low-rank approximation and error analysis
An error analysis that directly relates the Nyström approximation quality with the encoding powers of the landmark points in summarizing the data is presented, and the resultant error bound suggests a simple and efficient sampling scheme, the k-means clustering algorithm, for Nyström low-rank approximation.
Convolutional Kernel Networks
This paper proposes a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel, and bridges a gap between the neural network literature and kernels, which are natural tools to model invariance.
Kernel Methods and Machine Learning
This chapter discusses kernel methods for estimation, prediction, and system identification, as well as kNN, PNN, and Bayes classifiers, and their applications in machine learning and cluster discovery.