In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences…
MOTIVATION: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to…
InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine…
The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different…
Mobile genetic elements are major contributing factors to the generation of genetic diversity in prokaryotic organisms. For example, insertion sequence (IS) elements have been shown to specifically contribute to niche adaptation by promoting a variety of genetic rearrangements. The complete genome sequence of the cheese culture Lactobacillus helveticus DPC…
InterPro amalgamates predictive protein signatures from a number of well-known partner databases into a single resource. To aid with interpretation of results, InterPro entries are manually annotated with terms from the Gene Ontology (GO). The InterPro2GO mappings comprise the cross-references between these two resources and are the largest source…
The recently sequenced genome of Lactobacillus helveticus DPC4571 [1] revealed a dairy organism with significant homology (75% of genes are homologous) to a probiotic bacterium, Lb. acidophilus NCFM [2]. This led us to hypothesise that a group of genes could be identified that defines an organism's niche. Taking 11 fully sequenced lactic acid bacteria…
InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member…
Bifidobacterium pseudolongum subsp. globosum DPC479 is an intestinally derived strain that contains a plasmid, pASV479, 4.8 kb in size. This plasmid has a G + C content of 59% and contains six open reading frames (ORFs), four of which are cryptic. The other two ORFs have 47% and 54% identity, respectively, to the replication and FtsK-like proteins found in…
The removal of annotation from biological databases is often perceived as an indicator of erroneous annotation. As a corollary, annotation stability is considered to be a measure of reliability. However, diverse data-driven events can affect the stability of annotations in both primary protein sequence databases and the protein family databases that are…