Learn More
In recent years improvements to existing programs and the introduction of new iterative algorithms have changed the state-of-the-art in protein sequence alignment. This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases. Even below the 'twilight zone' at 10-20% residue(More)
Multiple sequence alignment is one of the cornerstones of modern molecular biology. It is used to identify conserved motifs, to determine protein domains, in 2D/3D structure prediction by homology and in evolutionary studies. Recently, high-throughput technologies such as genome sequencing and structural proteomics have lead to an explosion in the amount of(More)
BAliBASE is specifically designed to serve as an evaluation resource to address all the problems encountered when aligning complete sequences. The database contains high quality, manually constructed multiple sequence alignments together with detailed annotations. The alignments are all based on three-dimensional structural superpositions, with the(More)
MOTIVATION Most multiple sequence alignment programs use heuristics that sometimes introduce errors into the alignment. The most commonly used methods to correct these errors use iterative techniques to maximize an objective function. We present here an alternative, knowledge-based approach that combines a number of recently developed methods into a(More)
Multiple sequence alignment is a fundamental tool in a number of different domains in modern molecular biology, including functional and evolutionary studies of a protein family. Multiple alignments also play an essential role in the new integrated systems for genome annotation and analysis. Thus, the development of new multiple alignment scores and(More)
BACKGROUND The Gene Ontology (GO) is a well known controlled vocabulary describing the biological process, molecular function and cellular component aspects of gene annotation. It has become a widely used knowledge source in bioinformatics for annotating genes and measuring their semantic similarity. These measures generally involve the GO graph structure,(More)
Four consensus sequences are conserved with the same linear arrangement in RNA-dependent DNA polymerases encoded by retroid elements and in RNA-dependent RNA polymerases encoded by plus-, minus- and double-strand RNA viruses. One of these motifs corresponds to the YGDD span previously described by Kamer and Argos (1984). These consensus sequences altogether(More)
The peroxisome is an essential eukaryotic organelle, crucial for lipid metabolism and free radical detoxification, development, differentiation, and morphogenesis from yeasts to humans. Loss of peroxisomes invariably leads to fatal peroxisome biogenesis disorders in man. The evolutionary origin of peroxisomes remains unsolved; proposals for either a(More)
The sequence of Rift Valley fever virus L segment that we published in a previous paper was erroneous in the 3'-terminal region of the antigenomic RNA molecule. Here, we have shown that the L segment is in fact 6404 nucleotides long and encodes a polypeptide of 237.7K in the viral complementary sense. Sequence comparisons performed between the RNA-dependent(More)