A novel feature selection techniques based on contrast set mining

Abstract

Data classification is a challenging task in the era of big data because of the large number of features. Feature selection is a step in the process of knowledge discovery in data that aims to reduce dimensionality and improve classification performance. The purpose of this research is to define new feature selection techniques that improve classification accuracy and reduce the time required for feature selection. The subject of the research is the application and evaluation of contrast set mining techniques as feature selection techniques. An extensive comparison with benchmark feature selection techniques is conducted on 128 data sets to determine whether contrast set mining techniques can serve as superior feature selection techniques and whether they can eliminate the bottleneck of the entire process of knowledge discovery in data. The results of the 1792 analyses showed that, in more than 80% of the 128 analyzed data sets, contrast set mining techniques yielded more accurate classification and faster feature selection than the benchmark feature selection techniques.

Key-Words: Contrast set mining, Feature selection, STUCCO, Magnum Opus, Data mining comparative analysis, neural networks, classification
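To make the idea of contrast-set-based feature selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes categorical features and considers only single attribute-value pairs, scoring each feature by the largest difference in support that any of its values shows between classes. That is the "large difference" criterion used in STUCCO-style contrast set mining; the statistical significance test and the search over conjunctions of attribute values are omitted. The function names and the `min_dev` threshold value are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def support_difference_scores(rows, labels):
    """Score each feature by the largest between-class support gap of any of its values."""
    class_sizes = Counter(labels)
    # counts[(feature_index, value)][class_label] -> number of rows with that value
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            counts[(j, value)][label] += 1

    scores = defaultdict(float)
    for (j, _value), per_class in counts.items():
        # Support of this attribute-value pair within each class
        supports = [per_class[c] / n for c, n in class_sizes.items()]
        scores[j] = max(scores[j], max(supports) - min(supports))
    return dict(scores)


def select_features(rows, labels, min_dev=0.3):
    """Keep features whose best support difference reaches min_dev (illustrative threshold)."""
    scores = support_difference_scores(rows, labels)
    return sorted(j for j, s in scores.items() if s >= min_dev)


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [
    ("yes", "red"),
    ("yes", "blue"),
    ("no", "red"),
    ("no", "blue"),
]
labels = ["pos", "pos", "neg", "neg"]

selected = select_features(rows, labels)
```

On the toy data, only the first feature is retained, since its values occur with very different frequencies in the two classes while the second feature's values are evenly spread.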

Cite this paper

@inproceedings{Oreski2015ANF, title={A novel feature selection techniques based on contrast set mining}, author={Dijana Oreski and Bozidar Klicek}, year={2015} }