Feature subset selection with cumulate conditional mutual information minimization

Abstract

Feature selection is one of the core issues in designing pattern recognition and machine learning systems and has attracted considerable attention in the literature. In this paper, a new feature subset selection algorithm based on conditional mutual information is proposed. It first finds a subset whose mutual information with the class is guaranteed to equal that of the original feature set, and then eliminates potentially redundant features so as to minimize information loss, using the cumulate conditional mutual information minimization criterion. From a reliability standpoint, this criterion also reduces the disturbance caused by insufficient samples in conditional mutual information estimation. In addition, a fast implementation of conditional mutual information estimation is proposed to address the otherwise intractable computational cost. Empirical results show that the algorithm is efficient and achieves better accuracy than several representative feature selection algorithms for three typical classifiers on various datasets. © 2011 Elsevier Ltd. All rights reserved.
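
The quantity at the center of the abstract, the conditional mutual information I(X; Y | Z), can be illustrated with a plain plug-in (empirical frequency) estimator for discrete variables. The sketch below is only an illustration under that assumption: the function name `conditional_mutual_information` is hypothetical, and it reproduces neither the paper's fast implementation nor its cumulate minimization criterion, which the abstract does not spell out.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for discrete sequences.

    Hypothetical helper for illustration; not the paper's fast estimator.
    x, y, z are equal-length sequences of hashable values.
    """
    n = len(x)
    if n == 0 or len(y) != n or len(z) != n:
        raise ValueError("x, y and z must be non-empty and of equal length")

    # Joint and marginal counts needed by the estimator.
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)

    # I(X;Y|Z) = sum p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ];
    # with counts, the factors of n inside the logarithm cancel.
    cmi = 0.0
    for (xi, yi, zi), c in n_xyz.items():
        ratio = c * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        cmi += (c / n) * log2(ratio)
    return cmi


# Example: the class is the XOR of two features, so feature f1 is
# informative about the class only once f2 is known.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]
cmi_given_f2 = conditional_mutual_information(f1, label, f2)
```

Because each conditioning value splits the sample into smaller cells, the counts behind such an estimate become sparse as more features are conditioned on, which is the sample-insufficiency problem the abstract refers to.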

DOI: 10.1016/j.eswa.2011.12.003

