Zhihua Du

Identification of groups of genes that manifest similar expression patterns is a key step in the analysis of gene expression data. Hierarchical clustering was developed for that purpose. A fundamental problem with previous implementations of this clustering method is their inability to handle large data sets within reasonable time and memory limits.
Microarray technology has been widely applied to measure gene expression levels for thousands of genes simultaneously. Gene cluster analysis is useful for discovering gene function because co-expressed genes are likely to share the same biological function. K-means is one of the well-known clustering methods. However, it is sensitive to …
Reconstruction of phylogenetic trees for very large datasets is a known example of a computationally hard problem. In this paper, we present a parallel computing model for the widely used Multiple Instruction Multiple Data (MIMD) architecture. Following the idea of divide-and-conquer, our model adapts the recursive-DCM3 decomposition method [Roshan, U., …
KH (hnRNP K homology) domains, consisting of approximately 70 amino acid residues, are present in a variety of nucleic-acid-binding proteins. Among these are poly(C)-binding proteins (PCBPs), which are important regulators of mRNA stability and posttranscriptional regulation in general. All PCBPs contain three different KH domains and recognize …
Multiple sequence alignment (MSA) is one of the fundamental research topics in computational biology. Alignments help reveal functional assignments, evolutionary history, and conserved regions. Previous methods use a substitution matrix and do not incorporate knowledge of the sequences being aligned. Therefore, they do not assure the alignment of …
Mutational and NMR methods were used to investigate features of sequence, structure, and dynamics that are associated with the ability of a pseudoknot to stimulate a -1 frameshift. In vitro frameshift assays were performed on retroviral gag-pro frameshift-stimulating pseudoknots and their derivatives, a pseudoknot from the gene 32 mRNA of bacteriophage T2 …
Hierarchical clustering is the most commonly used method for grouping similar patterns in gene expression data. A fundamental problem with existing implementations of this clustering method is their inability to handle large data sets within reasonable time and memory limits. We propose a parallelized hierarchical clustering algorithm to solve this …