• Corpus ID: 5885974

Feature Selection via Concave Minimization and Support Vector Machines

@inproceedings{Bradley1998FeatureSV,
  title={Feature Selection via Concave Minimization and Support Vector Machines},
  author={Paul S. Bradley and Olvi L. Mangasarian},
  booktitle={ICML},
  year={1998}
}
Computational comparison is made between two feature selection approaches for finding a separating plane that discriminates between two point sets in an n-dimensional feature space that utilizes as few of the n features (dimensions) as possible. In the concave minimization approach [19, 5] a separating plane is generated by minimizing a weighted sum of distances of misclassified points to two parallel planes that bound the sets and which determine the separating plane midway between them… 
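
As a rough illustration of the objective described above, the sketch below (not the authors' code) solves a 1-norm-penalized version of the bounding-planes problem as a linear program with SciPy: the first term averages the distances by which points of each set violate their bounding plane, while the variable s bounds |w| componentwise so that the term on s acts as ||w||_1 and drives the weights of unneeded features to zero. The function name, the trade-off parameter lam, and the fixed weighting are illustrative assumptions, not the paper's exact formulation.

import numpy as np
from scipy.optimize import linprog

def sparse_separating_plane(A, B, lam=0.05):
    """A (m x n) and B (k x n) hold the two point sets, one point per row."""
    m, n = A.shape
    k = B.shape[0]
    # Decision vector: [w (n), gamma (1), y (m), z (k), s (n)], with |w| <= s.
    c = np.concatenate([
        np.zeros(n + 1),              # w, gamma enter only through constraints
        np.full(m, (1.0 - lam) / m),  # y: violations of the plane bounding A
        np.full(k, (1.0 - lam) / k),  # z: violations of the plane bounding B
        np.full(n, lam),              # s: componentwise bound on |w|, so e's acts as ||w||_1
    ])
    I_n, I_m, I_k = np.eye(n), np.eye(m), np.eye(k)
    A_ub = np.vstack([
        # A w - gamma e + y >= e   rewritten as   -A w + gamma e - y <= -e
        np.hstack([-A, np.ones((m, 1)), -I_m, np.zeros((m, k)), np.zeros((m, n))]),
        # B w - gamma e - z <= -e  (points of B lie on the other side of their plane)
        np.hstack([B, -np.ones((k, 1)), np.zeros((k, m)), -I_k, np.zeros((k, n))]),
        #  w - s <= 0  and  -w - s <= 0  together enforce |w| <= s
        np.hstack([I_n, np.zeros((n, 1 + m + k)), -I_n]),
        np.hstack([-I_n, np.zeros((n, 1 + m + k)), -I_n]),
    ])
    b_ub = np.concatenate([-np.ones(m + k), np.zeros(2 * n)])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + k + n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    assert res.success, res.message
    w, gamma = res.x[:n], res.x[n]
    selected = np.flatnonzero(np.abs(w) > 1e-6)  # indices of surviving features
    return w, gamma, selected

The resulting classifier is sign(x'w - gamma); features whose weight is (numerically) zero are suppressed and need not be measured for new points. The paper's concave minimization (FSV) approach instead suppresses features through a smooth concave approximation of the count of nonzero weights, solved by successive linear approximation; that variant is not sketched here.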

Citations

Generalized Support Vector Machines
By setting apart the two functions of a support vector machine: separation of points by a nonlinear surface in the original space of patterns, and maximizing the distance between separating planes in
Data selection for support vector machine classifiers
TLDR
The proposed approach incorporates a feature selection procedure that results in a minimal number of input features used by the classifier, making MSVM a useful incremental classification tool that maintains only a small fraction of a large dataset before merging and processing it with new incoming data.
Mathematical programming approaches to machine learning and data mining
TLDR
The feature selection approach via concave minimization computes a separating-plane-based classifier that improves upon the generalization ability of a separating plane computed without feature suppression; the results support the claim that mathematical programming is effective as the basis of data mining tools that extract patterns containing “knowledge” from a database and thus achieve “knowledge discovery in databases”.
Feature Selection for Nonlinear Kernel Support Vector Machines
  • O. Mangasarian, Gang Kou
  • Computer Science
    Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007)
  • 2007
TLDR
An easily implementable mixed-integer algorithm is proposed that generates a nonlinear kernel support vector machine (SVM) classifier with reduced input space features and improves the accuracy of a full-feature classifier by over 30%.
Integrated classifier hyperplane placement and feature selection
Semi-supervised support vector machines for unlabeled data classification
TLDR
Computational results show that clustered concave minimization yields test set improvement as high as 20.4% over a linear support vector machine trained on a correspondingly small but randomly chosen subset that is labeled by an expert.
Support vector machine classification via parameterless robust linear programming
TLDR
It is shown that the problem of minimizing the sum of arbitrary-norm real distances to misclassified points, from a pair of parallel bounding planes of a classification problem, leads to a simple parameterless linear program.
Minimal Kernel Classifiers
TLDR
A finite concave minimization algorithm is proposed for constructing kernel classifiers that use a minimal number of data points both in generating and characterizing a classifier and results in a much faster classifier that requires less storage.
Feature selection combining linear support vector machines and concave optimization
TLDR
This work proposes a feature selection strategy based on the combination of support vector machines (for obtaining good classifiers) with a concave optimization approach (for finding sparse solutions) and reports results of an extensive computational experience showing the efficiency of the proposed methodology.
Benchmarking Least Squares Support Vector Machine Classifiers
TLDR
Both the SVM and LS-SVM classifiers with RBF kernel, in combination with standard cross-validation procedures for hyperparameter selection, achieve comparable test set performances that are consistently very good compared to a variety of methods described in the literature.
...

References

Showing 1-10 of 33 references
Arbitrary-norm separating plane
Robust linear programming discrimination of two linearly inseparable sets
TLDR
A single linear programming formulation is proposed which generates a plane that minimizes an average sum of misclassified points belonging to two disjoint point sets in n-dimensional real space, without the imposition of extraneous normalization constraints that inevitably fail to handle certain cases.
An Equivalence Between Sparse Approximation and Support Vector Machines
  • F. Girosi
  • Computer Science
    Neural Computation
  • 1998
TLDR
If the data are noiseless, the modified version of basis pursuit denoising proposed in this article is equivalent to SVM in the following sense: if applied to the same data set, the two techniques give the same solution, which is obtained by solving the same quadratic programming problem.
Toward Optimal Feature Selection
TLDR
An efficient algorithm for feature selection which computes an approximation to the optimal feature selection criterion is given, showing that the algorithm effectively handles datasets with a very large number of features.
A support vector machine approach to decision trees
  • Kristin P. Bennett, J. A. Blue
  • Computer Science
    1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98CH36227)
  • 1998
TLDR
The "optimal" decision tree is characterized, and both a primal and dual space formulation for constructing the tree are proposed and the result is a method for generating logically simple decision trees with multivariate linear, nonlinear or linear decisions.
Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms
TLDR
This article reviews five approximate statistical tests for determining whether one learning algorithm outperforms another on a particular learning task and measures the power (ability to detect algorithm differences when they do exist) of these tests.
Parsimonious Least Norm Approximation
TLDR
Numerical tests on a signal-processing-based example indicate that the proposed method is comparable to a method that parametrically minimizes the 1-norm of the solution x and the error ‖Ax-b-p‖1, and that both methods are superior, by orders of magnitude, to solutions obtained by least squares.
Readings in Machine Learning
TLDR
Readings in Machine Learning collects the best of the published machine learning literature, including papers that address a wide range of learning tasks, and that introduce a variety of techniques for giving machines the ability to learn.
Introduction to the theory of neural computation
TLDR
This book is a detailed, logically-developed treatment that covers the theory and uses of collective computational networks, including associative memory, feed forward networks, and unsupervised learning.
...