Learn More
A glycosylphosphatidylinositol (GPI) anchor is a common but complex C-terminal post-translational modification of extracellular proteins in eukaryotes. Here we investigate the problem of correctly annotating GPI-anchored proteins for the growing number of sequences in public databases. We developed a computational system, called FragAnchor, based on the(More)
The human prion gene contains five copies of a 24 nt repeat that is highly conserved among species. An analysis of folding free energies of the human prion mRNA, in particular in the repeat region, suggested biased codon selection and the presence of RNA patterns. In particular, pseudoknots, similar to the one predicted by Wills in the human prion mRNA,(More)
UNLABELLED Viruses have a profound influence on the ecology and evolution of plankton, but our understanding of the composition of the aquatic viral communities is still rudimentary. This is especially true of those viruses having RNA genomes. The limited data that have been published suggest that the RNA virioplankton is dominated by viruses with(More)
Viruses are abundant in the ocean and a major driving force in plankton ecology and evolution. It has been assumed that most of the viruses in seawater contain DNA and infect bacteria, but RNA-containing viruses in the ocean, which almost exclusively infect eukaryotes, have never been quantified. We compared the total mass of RNA and DNA in the viral(More)
BACKGROUND A large family of viruses that infect bacteria, called phages, is characterized by long tails used to inject DNA into their victims' cells. The tape measure protein got its name because the length of the corresponding gene is proportional to the length of the phage's tail: a fact shown by actually copying or splicing out parts of DNA in exemplar(More)
Phosphorylation is an important post-translational modification of proteins that is essential to the regulation of many cellular processes. Although most of the phosphorylation sites discovered in protein sequences have been identified experimentally, the <i>in vivo</i> and <i>in vitro</i> discovery of the sites is an expensive, time-consuming and laborious(More)
Classification is the problem of assigning each input object to one of a finite number of classes. This problem has been extensively studied in machine learning and statistics, and there are numerous applications to bioinformatics as well as many other fields. Building a multiclass classifier has been a challenge, where the direct approach of altering the(More)
BACKGROUND In environmental sequencing projects, a mix of DNA from a whole microbial community is fragmented and sequenced, with one of the possible goals being to reconstruct partial or complete genomes of members of the community. In communities with high diversity of species, a significant proportion of the sequences do not overlap any other fragment in(More)
Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of(More)
Genomes evolve with both mutations and large scale events, such as inversions, translocations, duplications and losses, that modify the structure of a set of chromosomes. In order to study these types of large-scale events, the first task is to select, in different genomes, sub-sequences that are considered " equivalent ". Many approaches have been used to(More)