Feature Cluster Selection for High-Throughput Data Analysis

Abstract

Feature selection is effective at identifying predictive gene sets for microarray classification. However, the large number of predictive gene sets and the disparity among them present a challenge for identifying potential biomarkers. To facilitate biomarker identification, we present a new data mining task, feature cluster selection, which selects a small number of coherent and predictive feature clusters from the full set of features. We provide both a theoretical definition and an empirical formulation of the new problem, and propose an efficient algorithm, 3M. Experiments on microarray data show that the 3M algorithm can select predictive and statistically significant gene clusters.

DOI: 10.1109/BIBM.2007.33

Cite this paper

@article{Yu2007FeatureCS, title={Feature Cluster Selection for High-Throughput Data Analysis}, author={Lei Yu and Hao Li}, journal={International Journal of Data Mining and Bioinformatics}, year={2007}, volume={3}, number={2}, pages={177--191} }