• Corpus ID: 1513614

Large Scale Multiple Kernel Learning

Authors: Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer, Bernhard Schölkopf
Journal: J. Mach. Learn. Res.
While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling the standard SVM implementations. Moreover, we generalize the formulation and our method to… 
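To make the "conic combination of kernel matrices" concrete, here is a minimal sketch in plain Python. It only builds a combined Gram matrix from nonnegative weights; it is not the semi-infinite linear program from the paper, and the weights here are chosen by hand rather than learned. The helper names (`linear_kernel`, `rbf_kernel`, `gram`, `combine`) and the toy points are assumptions made for illustration.

```python
import math

def linear_kernel(x, y):
    # Plain dot product between two equal-length vectors.
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    # Gaussian (RBF) kernel; gamma is an illustrative default.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram(kernel, points):
    # Kernel (Gram) matrix of a kernel function over a list of points.
    return [[kernel(p, q) for q in points] for p in points]

def combine(grams, betas):
    # Conic combination: sum_k beta_k * K_k with every beta_k >= 0.
    if any(b < 0 for b in betas):
        raise ValueError("conic combination requires nonnegative weights")
    n = len(grams[0])
    return [
        [sum(b * K[i][j] for b, K in zip(betas, grams)) for j in range(n)]
        for i in range(n)
    ]

# Toy data and hand-picked weights (assumptions for illustration only).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
grams = [gram(linear_kernel, points), gram(rbf_kernel, points)]
betas = [0.25, 0.75]
K = combine(grams, betas)
```

Because each base matrix is positive semidefinite and the weights are nonnegative, the combined matrix is again a valid kernel matrix and could be passed to any SVM solver that accepts a precomputed kernel.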

A General and Efficient Multiple Kernel Learning Algorithm

The multiple kernel learning problem can be rewritten as a semi-infinite linear program that can be solved efficiently by recycling standard SVM implementations, and the formulation and method generalize to a larger class of problems, including regression and one-class classification.

SimpleMKL

This paper proposes an algorithm, named SimpleMKL, for solving the MKL problem and provides new insight into MKL algorithms based on mixed-norm regularization by showing that the two approaches are equivalent.

Building Sparse Multiple-Kernel SVM Classifiers

Experiments on a large number of toy and real-world data sets show that the resulting classifier is compact and accurate, and that it can be trained easily by alternating between a linear program and a standard SVM solver.

More generality in efficient multiple kernel learning

It is observed that existing MKL formulations can be extended to learn general kernel combinations subject to general regularization while retaining all the efficiency of existing large scale optimization algorithms.

An efficient multiple-kernel learning for pattern classification

Multiple kernel learning based on local and nonlinear combinations

A new MKL method is proposed, which is based on a local and nonlinear combination of different kernels using a gating model for selecting the appropriate kernel function and has performed better than the other methods analyzed.

Learning SVM with Complex Multiple Kernels Evolved by Genetic Programming

The numerical experiments show that the SVM using the evolved complex multiple kernels performs better than one using the classic simple kernels, and that, on the considered data sets, the new multiple kernels outperform both the cLMK and eLMK linear multiple kernels.

Multiple kernel learning using nonlinear lasso

A novel MKL model is developed based on a nonlinear Lasso, namely the Hilbert–Schmidt independence criterion (HSIC) Lasso; it has a clear statistical interpretation: minimally redundant kernels with maximum dependence on the output labels are found and combined.

Spectral Projected Gradient Descent for Efficient and Large Scale Generalized Multiple Kernel Learning

This work addresses the problem of learning the kernel in a Support Vector Machine framework from training data by developing a Spectral Projected Gradient descent optimizer which takes into account second order information in selecting step sizes and employs a nonmonotone step size selection criterion requiring fewer function evaluations.

Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming

This paper proposes a Quadratically Constrained Quadratic Programming (QCQP) formulation for the kernel learning problem, which can be solved more efficiently than SDP and shows that the QCQP formulation can be extended naturally to the multi-class case.



Multiple kernel learning, conic duality, and the SMO algorithm

Experimental results are presented that show that the proposed novel dual formulation of the QCQP as a second-order cone programming problem is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.

Column-generation boosting methods for mixture of kernels

A boosting approach to classification and regression based on column generation (CG) using a mixture of kernels is proposed; it produces sparser solutions, which significantly reduces testing time, and it scales CG boosting to large datasets.

MARK: a boosting algorithm for heterogeneous kernel models

This work proposes the Multiple Additive Regression Kernels (MARK) algorithm, which considers a large (potentially infinite) library of kernel matrices formed by different kernel functions and parameters and investigates how MARK is applied to heterogeneous kernel ridge regression.

Fast training of support vector machines using sequential minimal optimization (in Advances in Kernel Methods)

SMO breaks the large quadratic programming problem of SVM training into a series of the smallest possible QP problems, which avoids a time-consuming numerical QP optimization in the inner loop; SMO is fastest for linear SVMs and sparse data sets.

Making large scale SVM learning practical

This chapter presents algorithmic and computational results developed for SVMlight V2.0, which make large-scale SVM training more practical and give guidelines for the application of SVMs to large domains.

Learning Interpretable SVMs for Biological Sequence Classification

Novel and efficient algorithms are proposed for solving the so-called Support Vector Multiple Kernel Learning problem; they can be used to interpret the resulting support vector decision function and to extract biologically relevant knowledge about the sequence analysis problem at hand.

Large scale genomic sequence SVM classifiers

This work studies two recently proposed and successfully used kernels, namely the Spectrum kernel and the Weighted Degree kernel, and suggests several extensions using suffix trees and modifications of an SMO-like SVM training algorithm in order to accelerate the training of the SVMs and their evaluation on test sequences.

Sparse Regression Ensembles in Infinite and Finite Hypothesis Spaces

There exists an optimal solution to the infinite hypothesis space problem consisting of a finite number of hypotheses, and two algorithms for solving the infinite and finite hypothesis problems are proposed.

A statistical framework for genomic data fusion

This paper describes a computational framework for integrating and drawing inferences from a collection of genome-wide measurements represented via a kernel function, which defines generalized similarity relationships between pairs of entities, such as genes or proteins.

The Spectrum Kernel: A String Kernel for SVM Protein Classification

A new sequence-similarity kernel, the spectrum kernel, is introduced for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem and performs well in comparison with state-of-the-art methods for homology detection.
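As an illustration of the spectrum kernel summarized above, the following is a minimal sketch assuming the standard definition: each sequence is mapped to its counts of length-k substrings (k-mers), and the kernel value is the inner product of the two count vectors. The function names and the default `k=3` are assumptions for illustration; this naive version does not use the suffix-tree speedups mentioned elsewhere in this list.

```python
from collections import Counter

def kmer_counts(seq, k):
    # Count every contiguous length-k substring of seq.
    # Sequences shorter than k yield an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(s, t, k=3):
    # Inner product of the two k-mer count vectors.
    counts_s = kmer_counts(s, k)
    counts_t = kmer_counts(t, k)
    # Counter returns 0 for k-mers that do not occur in t.
    return sum(count * counts_t[kmer] for kmer, count in counts_s.items())
```

For example, with `k=2` the strings `"ABAB"` and `"BABA"` share the k-mers `AB` and `BA`, with counts (2, 1) and (1, 2) respectively, so `spectrum_kernel("ABAB", "BABA", k=2)` evaluates to 4.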