Mining Association Rule Bases from Integrated Genomic Data and Annotations

Abstract

Over the last decade, several clustering and association rule mining techniques have been applied to identify groups of co-regulated genes in gene expression data. Integrating biological knowledge and gene expression data into a single framework has since become a major challenge, both to improve the relevance of mined patterns and to simplify their interpretation by biologists. The GenMiner approach was developed to mine, from such integrated datasets, association rules that reveal gene groups that are both co-expressed (sharing similar expression profiles) and co-annotated (sharing the same annotations, such as function or regulatory mechanism). It combines a new normalized discretization method, called NorDi, with the Close algorithm to extract only minimal non-redundant association rules. Compared with classical Apriori-based approaches, GenMiner makes extraction more tractable on these datasets and reduces the number of association rules by eliminating redundant rules, which carry no additional information. We present a new Java implementation of GenMiner and experimental results obtained from microarray datasets integrated with biological knowledge (bio-ontologies, descriptions of regulation pathways, and literature). These results show that GenMiner requires less memory than Apriori-based approaches and that it improves the relevance of the extracted rules. Moreover, the association rules obtained reveal significant co-annotated and co-expressed gene patterns, showing important biological relationships supported by recent biological literature.
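
To illustrate why closed itemsets yield fewer, non-redundant rules, the following is a minimal Python sketch under stated assumptions. It is not GenMiner, NorDi, or the Close algorithm itself: the toy dataset, the item names (`t1_up`, `GO:ribosome`, and so on), and all function names are hypothetical, and the closed itemsets are found by brute-force enumeration rather than by Close's level-wise search over generators. Each gene is represented as a set mixing already-discretized expression items with annotation items.

```python
from itertools import combinations

# Hypothetical integrated dataset: discretized expression items plus annotations.
GENES = {
    "g1": {"t1_up", "t2_up", "GO:ribosome"},
    "g2": {"t1_up", "t2_up", "GO:ribosome"},
    "g3": {"t1_up", "GO:ribosome"},
    "g4": {"t1_down", "GO:stress"},
}


def cover(itemset):
    """Genes whose item set contains every item of `itemset`."""
    return {g for g, items in GENES.items() if set(itemset) <= items}


def support(itemset):
    """Fraction of genes that contain `itemset`."""
    return len(cover(itemset)) / len(GENES)


def closure(itemset):
    """Largest itemset shared by all genes that contain `itemset`."""
    rows = [GENES[g] for g in cover(itemset)]
    return set.intersection(*rows) if rows else set(itemset)


def confidence(antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def frequent_itemsets(min_support):
    """Brute-force enumeration of all frequent itemsets (toy scale only)."""
    items = sorted(set().union(*GENES.values()))
    return {
        frozenset(combo)
        for k in range(1, len(items) + 1)
        for combo in combinations(items, k)
        if support(combo) >= min_support
    }


def closed_frequent_itemsets(min_support):
    """Closures of the frequent itemsets; duplicates collapse in the set."""
    return {frozenset(closure(s)) for s in frequent_itemsets(min_support)}
```

On this toy data with a minimum support of 0.5, seven frequent itemsets collapse into two closed itemsets, `{t1_up, GO:ribosome}` and `{t1_up, t2_up, GO:ribosome}`. Rules such as `t2_up -> GO:ribosome` and `t2_up -> t1_up` both follow from the single closure of `{t2_up}`, which is the kind of redundancy a minimal rule base avoids listing separately.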

DOI: 10.1007/978-3-642-02504-4_7

Cite this paper

@inproceedings{Martnez2008MiningAR, title={Mining Association Rule Bases from Integrated Genomic Data and Annotations}, author={Ricardo Mart{\'i}nez and Nicolas Pasquier and Claude Pasquier}, booktitle={CIBB}, year={2008} }