InterPro in 2017—beyond protein family and domain annotations
Recent developments with InterPro are reported, including the addition of two new databases, and the functionality to include residue-level annotation and prediction of intrinsic disorder, which enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.
InterPro in 2019: improving coverage, classification and access to protein sequence annotations
Recent developments with InterPro (version 70.0) and its associated software are reported, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website.
A large-scale evaluation of computational protein function prediction
Today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets, and there is considerable need for improvement of currently available tools.
Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies
The results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized, and strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
BayGenomics: a resource of insertional mutations in mouse embryonic stem cells
The BayGenomics gene-trap resource (http://baygenomics.ucsf.edu) provides researchers with access to thousands of mouse embryonic stem (ES) cell lines harboring characterized insertional mutations in…
An expanded evaluation of protein function prediction methods shows an improvement in accuracy
The second Critical Assessment of Functional Annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function, revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies.
Using Sequence Similarity Networks for Visualization of Relationships Across Diverse Protein Superfamilies
It is shown that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers in protein superfamilies, and sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
Divergent evolution of enzymatic function: mechanistically diverse superfamilies and functionally distinct suprafamilies.
The protein sequence and structure databases are now sufficiently representative that strategies nature uses to evolve new catalytic functions can be identified and may provide the basis for discovering the functions of proteins and enzymes in new genomes as well as provide guidance for in vitro evolution/engineering of new enzymes.
The Structure–Function Linkage Database
The Structure‐Function Linkage Database (SFLD) provides highly curated information about the relationships between protein structure and function, using a superfamily‐centric organization to allow users to easily investigate how conserved folds and active sites are able to perform a wide variety of chemical reactions.
Divergence of function in the thioredoxin fold suprafamily: evidence for evolution of peroxiredoxins from a thioredoxin-like ancestor.
Using the Shotgun program, it is found that sequences of reductases involved in maturation of cytochromes in certain bacteria bridge the sequences of thioredoxins and peroxiredoxins, providing further support for an evolutionary relationship between these proteins.