Mining microarray gene expression data with unsupervised possibilistic clustering and proximity graphs

Abstract

Gene expression data generated by DNA microarray experiments are a rich resource for medical diagnosis and for understanding disease. Unfortunately, the sheer volume of data makes it hard, and sometimes impossible, to interpret gene behavior correctly. In this work, we develop a possibilistic approach for mining gene microarray data. Our model consists of two steps. In the first step, we use possibilistic clustering to partition the data into groups (or clusters). The optimal number of clusters is determined automatically from the data, using information entropy as a validity measure. In the second step, we select the most representative genes from each computed cluster and model them as a graph called a proximity graph. This set of graphs (or hyper-graph) is then used to predict the function of new and previously unknown genes. Experimental results on real-world data sets show good performance and high prediction accuracy.
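The two-step pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so everything below is an assumption made for illustration: the clustering step uses the standard possibilistic c-means typicality update, the validity measure is a Shannon entropy over normalized typicalities, and the proximity graph is a Gabriel graph. The function names (`pcm`, `mean_entropy`, `representatives`, `gabriel_graph`), the scale parameters `eta`, and the toy data are all hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def typicality(d2, eta, m=2.0):
    """Standard possibilistic c-means typicality (assumed, not from the paper)."""
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

def typicality_matrix(data, centers, eta, m):
    """T[i][k] = typicality of point k with respect to cluster i."""
    return [[typicality(sq_dist(x, c), e, m) for x in data]
            for c, e in zip(centers, eta)]

def pcm(data, centers, eta, m=2.0, iters=50):
    """Alternate typicality and center updates for a fixed number of iterations."""
    centers = [list(c) for c in centers]
    dim = len(data[0])
    for _ in range(iters):
        T = typicality_matrix(data, centers, eta, m)
        for i, row in enumerate(T):
            w = [t ** m for t in row]
            total = sum(w)
            centers[i] = [sum(wk * x[j] for wk, x in zip(w, data)) / total
                          for j in range(dim)]
    return centers, typicality_matrix(data, centers, eta, m)

def mean_entropy(T):
    """Average Shannon entropy of per-point normalized typicalities.

    An assumed stand-in for the paper's entropy-based validity measure:
    lower values mean points are assigned less ambiguously.
    """
    n = len(T[0])
    total = 0.0
    for k in range(n):
        col = [row[k] for row in T]
        s = sum(col)
        if s == 0:
            continue
        total -= sum((t / s) * math.log(t / s) for t in col if t > 0)
    return total / n

def representatives(t_row, r):
    """Indices of the r points with the highest typicality in one cluster."""
    return sorted(range(len(t_row)), key=lambda k: -t_row[k])[:r]

def gabriel_graph(points):
    """Edges (a, b) such that no third point lies inside the ball with diameter ab."""
    edges = []
    n = len(points)
    for a in range(n):
        for b in range(a + 1, n):
            dab = sq_dist(points[a], points[b])
            if all(sq_dist(points[a], points[c]) + sq_dist(points[b], points[c]) >= dab
                   for c in range(n) if c not in (a, b)):
                edges.append((a, b))
    return edges

# Toy data: two well-separated groups of hypothetical 2-D expression profiles.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.3)]
centers, T = pcm(data, centers=[(0.5, 0.5), (4.5, 4.5)], eta=[1.0, 1.0])
h = mean_entropy(T)
reps = representatives(T[0], r=3)
graph = gabriel_graph([data[k] for k in reps])
```

To choose the number of clusters under these assumptions, one would run `pcm` for several candidate counts and compare `mean_entropy` across runs. A new gene could then be compared against the per-cluster graphs of representatives; how the paper performs that prediction step is not stated in the abstract.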

DOI: 10.1007/s10489-009-0161-3

Cite this paper

@article{Romdhane2009MiningMG, title={Mining microarray gene expression data with unsupervised possibilistic clustering and proximity graphs}, author={Lotfi Ben Romdhane and Hechmi Shili and B{\'e}chir el Ayeb}, journal={Applied Intelligence}, year={2009}, volume={33}, pages={220-231} }