Feature Selection for Classification: A Review
@inproceedings{Tang2014FeatureSF, title={Feature Selection for Classification: A Review}, author={Jiliang Tang and Salem Alelyani and Huan Liu}, booktitle={Data Classification: Algorithms and Applications}, year={2014} }
Nowadays, the growth of high-throughput technologies has resulted in exponential growth in the harvested data with respect to both dimensionality and sample size. The trend of this growth for the UCI machine learning repository is shown in Figure 1. Efficient and effective management of these data has become increasingly challenging. Traditional manual management of these datasets has become impractical. Therefore, data mining and machine learning techniques were developed to automatically discover…
982 Citations
Introduction to Feature Selection
- Computer Science
- Understanding and Using Rough Set Based Feature Selection: Concepts, Techniques and Applications
- 2019
In this chapter, the necessary preliminaries of feature selection are discussed; feature selection lets us select only the relevant data, which can then be used on behalf of the entire dataset.
A Review on Dimensionality Reduction Techniques
- Computer Science
- 2017
This paper analyses some existing popular feature selection and feature extraction techniques and addresses benefits and challenges of these algorithms which would be beneficial for beginners.
A Novel Feature Selection Method Based on Clustering
- Computer Science
- 2019
A feature selection method based on the mean shift clustering algorithm and the Pearson correlation coefficient is proposed to help address some of the challenges of real-time execution in data analytics systems.
Feature Selection using Genetic Programming
- Computer Science
- 2019
This paper investigates the ability of Genetic Programming (GP), an evolutionary search strategy capable of automatically finding solutions in complex and large search spaces, to perform feature selection. It shows that GP not only selects a smaller set of features from the original features, but that classifiers using the GP-selected features also achieve better classification performance than those using all the original features.
A New Intelligent Hybrid Feature Selection Method
- Computer Science
- 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA)
- 2018
A new hybrid feature selection method is introduced and evaluated against ten datasets from the UCI repository; experimental results show that the classifier used in the experiment achieves better classification accuracy than the version that used a single feature selection method.
Feature selection techniques in the context of big data: taxonomy and analysis
- Computer Science
- Applied Intelligence
- 2022
A comprehensive review of the latest FS approaches in the context of big data along with a structured taxonomy, which categorizes the existing methods based on their nature, search strategy, evaluation process, and feature structure and highlights the research issues and open challenges related to FS.
Evaluating Feature Selection Robustness on High-Dimensional Data
- Computer Science
- HAIS
- 2018
The robustness of some state-of-the-art selection methods is analyzed for different levels of data perturbation and different cardinalities of the selected feature subsets.
A Review of Grey Wolf Optimizer-Based Feature Selection Methods for Classification
- Computer Science
- Algorithms for Intelligent Systems
- 2019
This book chapter provides a brief review of the latest works on feature selection using the grey wolf optimizer (GWO), a recent optimization algorithm.
88 References
A review of feature selection techniques in bioinformatics
- Biology
- Bioinform.
- 2007
A basic taxonomy of feature selection techniques is provided, along with a discussion of their use, variety, and potential in a number of both common and upcoming bioinformatics applications.
The Effect of the Characteristics of the Dataset on the Selection Stability
- Computer Science
- 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence
- 2011
This work conducts an extensive experimental study using a variety of data sets and different well-known feature selection algorithms in order to study the behavior of these algorithms in terms of stability.
Online Feature Selection and Its Applications
- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2014
This article investigates the problem of online feature selection (OFS), in which an online learner is only allowed to maintain a classifier involving a small and fixed number of features, and presents novel algorithms to solve each of the two problems.
Feature Selection for Knowledge Discovery and Data Mining
- Computer Science
- The Springer International Series in Engineering and Computer Science
- 1998
Feature Selection for Knowledge Discovery and Data Mining offers an overview of the methods developed since the 1970s, provides a general framework for examining and categorizing these methods, and suggests guidelines for how to use different methods under various circumstances.
Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution
- Computer Science
- ICML
- 2003
A novel concept, predominant correlation, is introduced, and a fast filter method is proposed which can identify relevant features as well as redundancy among relevant features without pairwise correlation analysis.
An Introduction to Variable and Feature Selection
- Computer Science
- J. Mach. Learn. Res.
- 2003
The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.
Filter versus wrapper gene selection approaches in DNA microarray domains
- Computer Science
- Artif. Intell. Medicine
- 2004
Unsupervised Feature Selection Using Feature Similarity
- Computer Science
- IEEE Trans. Pattern Anal. Mach. Intell.
- 2002
An unsupervised feature selection algorithm is described that is suitable for data sets large in both dimension and size. It is based on measuring similarity between features, whereby redundancy among them is removed; it does not need any search and is fast.