MOTIVATION: An increasing body of literature shows that genomes of eukaryotes can contain clusters of functionally related genes. Most approaches to identify gene clusters utilize microarray data or metabolic pathway databases to find groups of genes on chromosomes that are linked by common attributes. A generalized method that can find gene clusters …
An important strategy to study genome evolution is to investigate the clustering of orthologous genes among multiple genomes, in which the most popular approaches require that the distance between adjacent genes in a cluster be small. We investigate a different formulation based on constraining the overall size of a cluster and develop statistical …
Two red algal classes, the Florideophyceae (approximately 7,100 spp.) and Bangiophyceae (approximately 193 spp.), comprise 98% of red algal diversity in marine and freshwater habitats. These two classes form well-supported monophyletic groups in most phylogenetic analyses. Nonetheless, the interordinal relationships remain largely unresolved, in particular …
Automated protein function prediction methods are the only practical approach for assigning functions to genes obtained from model organisms. Many of the previously reported function annotation methods are of limited utility for fungal protein annotation. They are often trained on only one species, are not available for high-volume data processing, or …
Teleaulax amphioxeia is a photosynthetic unicellular cryptophyte alga that is distributed throughout marine habitats worldwide. This alga is an important plastid donor to the dinoflagellate Dinophysis caudata through the ciliate Mesodinium rubrum in the marine food web. To better understand the genomic characteristics of T. amphioxeia, we have sequenced and …
In the carotenoid biosynthetic pathway, lycopene β-cyclase (LCYB) catalyzes the cyclization that converts lycopene into β-carotene. Only a single copy of LCYB was identified and was suggested to encode a chromoplast-specific LCYB (CYCB type) in watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai]. Splicing variants in the 5′-untranslated region were …
MapReduce is an effective tool for processing large amounts of data in parallel using a cluster of processors or computers. One common data processing task is the join operation, which combines two or more datasets based on values common to each. In this paper, we present a network-aware multi-way join for MapReduce (SmartJoin) that improves performance and …
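As an illustration of the join operation mentioned in the MapReduce abstract, the following is a minimal single-process sketch of a generic two-way reduce-side equi-join. It is not SmartJoin and makes no claim about that paper's network-aware multi-way method; the function names (`map_phase`, `shuffle`, `reduce_phase`, `reduce_side_join`) and the `(key, value)` record layout are assumptions made only for this example.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, tag):
    """Emit (join_key, (tag, value)) pairs, as a mapper would.

    Each record is assumed to be a (key, value) tuple; the tag marks
    which input dataset the record came from.
    """
    for key, value in records:
        yield key, (tag, value)


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, tagged_value in pairs:
        groups[key].append(tagged_value)
    return groups


def reduce_phase(key, tagged_values):
    """Pair every left value with every right value that shares the key."""
    left = [value for tag, value in tagged_values if tag == "L"]
    right = [value for tag, value in tagged_values if tag == "R"]
    for left_value, right_value in product(left, right):
        yield key, left_value, right_value


def reduce_side_join(left_records, right_records):
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = list(map_phase(left_records, "L")) + list(map_phase(right_records, "R"))
    groups = shuffle(pairs)
    joined = []
    for key in sorted(groups):  # sorted only to make the output order stable
        joined.extend(reduce_phase(key, groups[key]))
    return joined
```

Keys that appear in only one dataset produce no output rows, which is the behavior of an inner join. In a real cluster the shuffle step moves data between machines, which is presumably where network cost becomes relevant.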
Power management has always been a critical concern in cloud computing for supporting the rapid growth of data centers. In this paper, our strategy is to implement working vacation (WV) to reduce or eliminate unnecessary power consumed by idle servers. Two green systems are first proposed, where one implements a single WV and the other implements …