MOTIVATION: A growing body of literature shows that eukaryotic genomes can contain clusters of functionally related genes. Most approaches to identifying gene clusters use microarray data or metabolic pathway databases to find groups of genes on chromosomes that are linked by common attributes. A generalized method that can find gene clusters ...
Automated protein function prediction methods are the only practical approach for assigning functions to genes obtained from model organisms. Many of the previously reported function annotation methods are of limited utility for fungal protein annotation. They are often trained on only one species, are not available for high-volume data processing, or ...
An important strategy for studying genome evolution is to investigate the clustering of orthologous genes among multiple genomes. The most popular approaches require that the distance between adjacent genes in a cluster be small. We investigate a different formulation based on constraining the overall size of a cluster and develop statistical ...
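The contrast between the two cluster formulations mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method of the paper: genes are represented by integer positions along one chromosome, and the function names `max_gap_clusters` and `densest_window` and the parameters `max_gap` and `max_span` are hypothetical. The statistical part of the abstract is not covered.

```python
def max_gap_clusters(positions, max_gap):
    """Adjacent-distance formulation: split whenever two neighbouring
    genes are more than max_gap apart."""
    clusters, current = [], []
    for p in sorted(positions):
        if current and p - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(p)
    if current:
        clusters.append(current)
    return clusters


def densest_window(positions, max_span):
    """Overall-size formulation: the largest set of genes whose total
    span (last position minus first) is at most max_span."""
    pos = sorted(positions)
    best = []
    left = 0
    for right in range(len(pos)):
        # Shrink the window from the left until it fits within max_span.
        while pos[right] - pos[left] > max_span:
            left += 1
        if right - left + 1 > len(best):
            best = pos[left:right + 1]
    return best
```

With a generous gap threshold, the adjacent-distance rule can chain genes into one long cluster, whereas the size constraint bounds the total extent of the cluster directly.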
Two red algal classes, the Florideophyceae (approximately 7,100 spp.) and Bangiophyceae (approximately 193 spp.), comprise 98% of red algal diversity in marine and freshwater habitats. These two classes form well-supported monophyletic groups in most phylogenetic analyses. Nonetheless, the interordinal relationships remain largely unresolved, in particular ...
In the carotenoid biosynthetic pathway, lycopene β-cyclase (LCYB) catalyzes the cyclization that converts lycopene into β-carotene. Only a single copy of LCYB was identified in watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai], and it was suggested to encode a chromoplast-specific LCYB (CYCB type). Splicing variants in the 5′-untranslated region were ...
Teleaulax amphioxeia is a photosynthetic unicellular cryptophyte alga that is distributed throughout marine habitats worldwide. This alga is an important plastid donor to the dinoflagellate Dinophysis caudata through the ciliate Mesodinium rubrum in the marine food web. To better understand the genomic characteristics of T. amphioxeia, we have sequenced and ...
Power management has always been a critical concern in cloud computing as data centers grow rapidly. In this paper, our strategy is to implement working vacation (WV) to reduce or eliminate the unnecessary power consumed by idle servers. Two green systems are first proposed, where one implements a single WV and the other implements ...
Identifying genomic regions that descended from a common ancestor helps us study gene function and genome evolution. In distantly related genomes, clusters of homologous gene pairs are used in function prediction, operon detection, etc. Many computational methods have been proposed for defining gene clusters to ...
MapReduce is an effective tool for processing large amounts of data in parallel using a cluster of processors or computers. One common data processing task is the join operation, which combines two or more datasets based on values common to each. In this paper, we present a network-aware multi-way join for MapReduce (SmartJoin) that improves performance and ...
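To make the join operation described above concrete, here is a minimal in-memory sketch of a plain reduce-side multi-way equi-join, written as map, shuffle, and reduce steps. It is not SmartJoin and has no network awareness; the function names and the dataset layout (a dict mapping a dataset tag to a list of `(key, value)` pairs) are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product


def map_phase(tagged_datasets):
    """Emit (join_key, (tag, value)) for every record of every dataset."""
    for tag, records in tagged_datasets.items():
        for key, value in records:
            yield key, (tag, value)


def shuffle(pairs):
    """Group mapper output by join key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups


def reduce_phase(groups, tags):
    """For each key present in all datasets, emit every combination
    of one matching value per dataset."""
    for key, tagged in groups.items():
        by_tag = {t: [v for tag, v in tagged if tag == t] for t in tags}
        if all(by_tag[t] for t in tags):
            for combo in product(*(by_tag[t] for t in tags)):
                yield (key, *combo)


def multiway_join(tagged_datasets):
    """Inner join of all datasets on their shared key (hypothetical helper)."""
    tags = list(tagged_datasets)
    return sorted(reduce_phase(shuffle(map_phase(tagged_datasets)), tags))
```

In a real cluster, the shuffle step is where records travel over the network, which is presumably the cost a network-aware join tries to reduce.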
We sequenced and characterized the first complete mitochondrial genome of the sublittoral red alga Rhodymenia pseudopalmata (Rhodymeniales, Rhodophyta). The mitogenome is 26,166 bp in length with 29.5% GC content. The circular mitogenome contains 47 genes: 24 protein-coding, 2 rRNA and 21 tRNA genes, including two copies of trnG, trnL, trnM and ...