Background: Recent innovations in sequencing technologies have provided researchers with the ability to rapidly characterize the microbial content of an environmental or clinical sample with unprecedented resolution. These approaches are producing a wealth of information that is providing novel insights into the microbial ecology of the environment and human...
With ever-increasing storage demands and management costs, object-based storage is on the verge of becoming the next standard storage interface. The American National Standards Institute (ANSI) ratified the object-based storage interface standard (also referred to as OSD T-10) in January 2005. In this paper we present our experiences building a reference...
Understanding the categorization of human diseases is critical for reliably identifying disease-causing genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped more than 2000 phenotype-gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this...
Malaria causes over one million deaths annually, posing an enormous health and economic burden in endemic regions. The completion of genome sequencing of the causative agents, a group of parasites in the genus Plasmodium, revealed potential drug and vaccine candidates. However, genomics-driven target discovery has been significantly hampered by our limited...
Recomputing previously evaluated similarity results between biological sequences becomes inevitable when researchers discover errors in their sequenced data or have to compare closely similar sequences, e.g., in a family of proteins. We present an efficient scheme for updating local sequence alignments with an affine gap model...
The computational complexity of evaluating homologies between a gene sequence and profile hidden Markov models (HMMs) is relatively high. Unfortunately, researchers must re-evaluate matches every time they discover an error in a sequence or encounter a mutation of the sequence. Since these occurrences are frequent, it is desirable to have a low complexity...
Extended Abstract. 1. Introduction. Provenance is a well-established concept in the art world, where the lineage, pedigree, or origins of a painting are critical to determining its authenticity and value. It is of equal importance in present and future data-rich environments ranging from computational biology, intelligence information gathering, to high energy...
We present a novel approach for reducing the computational complexity of updating homologies produced by a wide class of popular state-of-the-art algorithms in comparative computational biology. The algorithms that we consider use hidden Markov models (HMMs) and a Viterbi recursion to evaluate matches between sequences, or between a sequence and models...
Sequence alignment or decoding in molecular biology is mostly done via computationally expensive dynamic programming (DP) based approaches. Unfortunately, because sequencing errors are discovered frequently, researchers must repeat all previous similarity analyses for the erroneous sequence. This can take hours or days. In this work, we derive relative tolerance...