Corpus ID: 316473

On p-norm Path Following in Multiple Kernel Learning for Non-linear Feature Selection

@inproceedings{Jawanpuria2014OnPP,
  title={On p-norm Path Following in Multiple Kernel Learning for Non-linear Feature Selection},
  author={Pratik Jawanpuria and Manik Varma and Saketha Nath Jagarlapudi},
  booktitle={ICML},
  year={2014}
}
Our objective is to develop formulations and algorithms for efficiently computing the feature selection path, i.e. the variation in classification accuracy as the fraction of selected features is varied from null to unity. Multiple Kernel Learning subject to ℓp≥1 regularization (ℓp-MKL) has been demonstrated to be one of the most effective techniques for non-linear feature selection. However, state-of-the-art ℓp-MKL algorithms are too computationally expensive to be invoked thousands of times…
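The path-following idea can be illustrated concretely: fix one base kernel per feature, solve ℓp-MKL on a grid of p values, and track how many features receive non-negligible kernel weight. The sketch below is a minimal illustration under stated assumptions, not the paper's path-following algorithm: the dataset (scikit-learn's breast-cancer data), the unit-width per-feature RBF kernels, C = 1, the small grid of p values, and the alternating wrapper with a Kloft-style closed-form weight update are all choices made for this example.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X = (X - X.mean(0)) / X.std(0)                       # standardize features
# One RBF kernel per feature (unit width); kernels has shape (n_features, n, n).
kernels = np.array([np.exp(-(f[:, None] - f[None, :]) ** 2) for f in X.T])

def lp_mkl(p, iters=20):
    """Alternate an SVM on the combined kernel with the closed-form
    lp-norm update of the kernel weights (Kloft et al. style)."""
    m = len(kernels)
    d = np.full(m, m ** (-1.0 / p))                  # uniform start, ||d||_p = 1
    for _ in range(iters):
        svm = SVC(kernel="precomputed", C=1.0)
        svm.fit(np.tensordot(d, kernels, axes=1), y)
        sv, alpha = svm.support_, svm.dual_coef_[0]
        # Squared block norms ||w_k||^2 = d_k^2 * alpha^T K_k alpha.
        w2 = np.array([d[k] ** 2 * alpha @ kernels[k][np.ix_(sv, sv)] @ alpha
                       for k in range(m)])
        d = w2 ** (1.0 / (p + 1))                    # d_k proportional to ||w_k||^(2/(p+1))
        d /= np.linalg.norm(d, ord=p)                # project back to ||d||_p = 1
    return d

for p in [1.01, 1.33, 2.0, 4.0]:                     # coarse grid along the path
    d = lp_mkl(p)
    frac = np.mean(d > 1e-3 * d.max())               # fraction of selected features
    print(f"p = {p:4.2f}: {frac:.0%} of features carry non-negligible weight")

As p decreases toward 1 the ℓp ball becomes more sparsity-inducing, so sweeping p traces out a feature selection path of the kind the paper proposes to compute far more efficiently.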
Citations

Exploiting the structure of feature spaces in kernel learning
The problem of learning the optimal representation for a specific task has recently become an important and non-trivial topic in the machine learning community. In this field, deep architectures are the…
Multiple Graph-Kernel Learning
TLDR: A Multiple Kernel Learning (MKL) approach that learns different weights for different bunches of features grouped by complexity; it defines a notion of kernel complexity, namely Kernel Spectral Complexity, and shows how this complexity relates to the well-known Empirical Rademacher Complexity for a natural class of functions that includes SVMs.
ℓ2,1 Norm Regularized Multi-kernel Based Joint Nonlinear Feature Selection and Over-sampling for Imbalanced Data Classification
TLDR: The experimental results demonstrate that jointly performing nonlinear feature selection and over-sampling within an ℓ2,1-norm multi-kernel learning framework (ℓ2,1-MKFSOS) leads to promising classification performance.
Generalized hierarchical kernel learning
TLDR: A generic regularizer enables the proposed formulation of Hierarchical Kernel Learning to be employed in Rule Ensemble Learning (REL), where the goal is to construct an ensemble of conjunctive propositional rules.
A Sequential Learning Approach for Scaling Up Filter-Based Feature Subset Selection
TLDR: The proposed framework uses multi-armed bandit algorithms to sequentially search a subset of variables and assign a level of importance to each feature, allowing it to scale naturally to large data sets, evaluate such data in a very small amount of time, and run independently of the optimization of any classifier, reducing unnecessary complexity.
Incorporating Distribution Matching into Uncertainty for Multiple Kernel Active Learning
TLDR: A multiple kernel active learning framework is proposed that incorporates a group regularizer of distribution information into the estimation of uncertainty, and takes advantage of multiple kernel learning to learn a kernel space in which complex structures are well captured by the kernel weights.
Learning Proximity Relations for Feature Selection
TLDR: A theoretical analysis of the generalization error of the proposed method is provided, validating its effectiveness, and experiments demonstrate the success of the approach when applied to feature selection.
A Geometric Viewpoint of the Selection of the Regularization Parameter in Some Support Vector Machines
TLDR: This work proposes an algorithm that identifies the neighbouring vertices of a given vertex, thereby identifying the classifiers corresponding to the set of vertices of this polytope, and then chooses a classifier based on a suitable test error criterion.
Learning Kernels for Multiple Predictive Tasks
TLDR: This thesis presents a family of regularized risk minimization based convex formulations, of increasing generality, for learning features (kernels) in various settings involving multiple tasks, and proposes a mixed-norm based formulation for learning the shared kernel as well as the prediction functions of all the tasks.
Soft Kernel Target Alignment for Two-Stage Multiple Kernel Learning
TLDR: ALIGNF+, a soft version of ALIGNF, is proposed, based on the observation that the dual problem of ALIGNF is essentially a one-class SVM problem and only requires an upper bound on the kernel weights of the original ALIGNF.

References

Showing 1-10 of 53 references
ℓp-Norm Multiple Kernel Learning
Learning linear combinations of multiple kernels is an appealing strategy when the right choice of features is unknown. Previous approaches to multiple kernel learning (MKL) promote sparse kernel…
Multi Kernel Learning with Online-Batch Optimization
TLDR: This work presents an MKL optimization algorithm based on stochastic gradient descent that has a guaranteed convergence rate, and introduces a p-norm formulation of MKL that controls the level of sparsity of the solution, leading to an easier optimization problem.
More generality in efficient multiple kernel learning
TLDR: It is observed that existing MKL formulations can be extended to learn general kernel combinations subject to general regularization while retaining all the efficiency of existing large-scale optimization algorithms.
Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
F. Bach. NIPS 2008.
TLDR: Extensive simulations on synthetic datasets and datasets from the UCI repository show that efficiently exploring the large feature space through sparsity-inducing norms leads to state-of-the-art predictive performance.
Multi-label Multiple Kernel Learning
TLDR: The proposed learning formulation leads to a non-smooth min-max problem, which can be cast as a semi-infinite linear program (SILP); an approximate formulation with a guaranteed error bound, involving an unconstrained convex optimization problem, is also presented.
Ultra-Fast Optimization Algorithm for Sparse Multi Kernel Learning
TLDR: This paper introduces a novel MKL formulation that mixes elements of p-norm and elastic-net style regularization, and proposes a fast stochastic gradient descent method that solves the novel MKL formulation.
L2 Regularization for Learning Kernels
TLDR: This paper presents a novel theoretical analysis of the problem of learning kernels with the same family of kernels but with an L2 regularization instead, and gives learning bounds for orthogonal kernels that contain only an additive term O(√p/m) when compared to the standard kernel ridge regression stability bound.
From Lasso regression to Feature vector machine
TLDR: A new approach named the Feature Vector Machine (FVM) is presented, which reformulates the standard Lasso regression into a form isomorphic to SVM; this form can be easily extended for feature selection with non-linear models by introducing kernels defined on feature vectors.
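FVM's starting point, the standard Lasso, already traces a linear feature-selection path over its regularization parameter, which makes a useful baseline for the non-linear paths discussed above. A minimal sketch using scikit-learn's lasso_path (this is plain Lasso, not FVM itself; the diabetes dataset and the default alpha grid are assumptions made for the example):

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import lasso_path

X, y = load_diabetes(return_X_y=True)
X = (X - X.mean(0)) / X.std(0)                 # standardize features
# coefs has shape (n_features, n_alphas); alphas is a descending grid.
alphas, coefs, _ = lasso_path(X, y)
for a, w in zip(alphas[::20], coefs.T[::20]):  # sample a few points on the path
    print(f"alpha = {a:7.3f}: {np.sum(w != 0)} of {X.shape[1]} features selected")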
SPF-GMKL: generalized multiple kernel learning with a million kernels
TLDR: A Spectral Projected Gradient descent optimizer is developed that takes into account second-order information in selecting step sizes, employs a non-monotone step-size selection criterion requiring fewer function evaluations, is robust to gradient noise, and can take quick steps when far from the optimum.
Scalable training of L1-regularized log-linear models
TLDR: This work presents Orthant-Wise Limited-memory Quasi-Newton (OWL-QN), an algorithm based on L-BFGS that can efficiently optimize the L1-regularized log-likelihood of log-linear models with millions of parameters.
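OWL-QN itself is not shipped with the common scientific Python libraries, but the model class it optimizes, L1-regularized log-linear (logistic) models, can be sketched with scikit-learn's liblinear solver; note this swaps a coordinate-descent optimizer in for OWL-QN, and the dataset and C grid are assumptions made for the example.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X = (X - X.mean(0)) / X.std(0)               # standardize features
for C in [0.01, 0.1, 1.0]:                   # smaller C = stronger L1 penalty
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, y)
    nnz = np.sum(clf.coef_ != 0)             # L1 drives many weights exactly to zero
    print(f"C = {C:5.2f}: {nnz} non-zero weights, train accuracy {clf.score(X, y):.3f}")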