CLICK: A Clustering Algorithm with Applications to Gene Expression Analysis

Abstract

Novel DNA microarray technologies enable the monitoring of expression levels of thousands of genes simultaneously. This allows a global view of the transcription levels of many (or all) genes when the cell undergoes specific conditions or processes. Analyzing gene expression data requires the clustering of genes into groups with similar expression patterns. We have developed a novel clustering algorithm, called CLICK, which is applicable to gene expression analysis as well as to other biological applications. No prior assumptions are made about the structure or the number of the clusters. The algorithm utilizes graph-theoretic and statistical techniques to identify tight groups of highly similar elements (kernels), which are likely to belong to the same true cluster. Several heuristic procedures are then used to expand the kernels into the full clustering. CLICK has been implemented and tested on a variety of biological datasets, ranging from gene expression and cDNA oligo-fingerprinting to protein sequence similarity. In all those applications it outperformed extant algorithms according to several common figures of merit. CLICK is also very fast, allowing clustering of thousands of elements in minutes, and over 100,000 elements in a couple of hours on a regular workstation.
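
The two-phase outline in the abstract (find tight kernels, then expand them into a full clustering) can be illustrated with a deliberately simplified sketch. This is not the CLICK procedure itself, whose graph-theoretic and statistical details the abstract does not give. As stand-in assumptions, similarity is Pearson correlation between expression profiles, kernels are connected components of a thresholded similarity graph, and expansion adopts each leftover element into the kernel it is most similar to on average. The function names and the threshold parameters are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def find_kernels(profiles, threshold=0.9, min_size=2):
    """Assumed stand-in for kernel detection: connected components of the
    graph whose edges join pairs with similarity >= threshold."""
    n = len(profiles)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if pearson(profiles[i], profiles[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    kernels = [g for g in groups.values() if len(g) >= min_size]
    # Elements of components too small to count as kernels are left over.
    leftovers = [i for g in groups.values() if len(g) < min_size for i in g]
    return kernels, leftovers


def expand_kernels(profiles, kernels, leftovers, adopt_threshold=0.5):
    """Assumed stand-in for the expansion heuristics: each leftover element
    joins the kernel with the highest average similarity, provided that
    average reaches adopt_threshold; otherwise it stays unclustered."""
    clusters = [list(k) for k in kernels]
    unclustered = []
    for s in leftovers:
        best, best_sim = None, adopt_threshold
        for cluster, kernel in zip(clusters, kernels):
            sim = sum(pearson(profiles[s], profiles[m]) for m in kernel) / len(kernel)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            unclustered.append(s)
        else:
            best.append(s)
    return clusters, unclustered
```

Calling `find_kernels(profiles, threshold=0.99)` and passing its two results to `expand_kernels` yields the clusters plus any elements that no kernel adopted. Comparing all pairs takes quadratic time, so this sketch says nothing about the running times reported in the abstract.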

Cite this paper

@inproceedings{Sharan2002CLICKAC, title={CLICK: A Clustering Algorithm with Applications to Gene Expression Analysis}, author={Roded Sharan and Ron Shamir}, year={2002} }