Mapping the Protein Universe

Liisa Holm and Chris Sander, pp. 595-602
The comparison of the three-dimensional shapes of protein molecules poses a complex algorithmic problem. Its solution provides biologists with computational tools to organize the rapidly growing set of thousands of known protein shapes, to identify new types of protein architecture, and to discover unexpected evolutionary relations, reaching back billions of years, between protein molecules. Protein shape comparison also improves tools for identifying gene functions in genome databases by… 
Geometric and topological methods in protein structure analysis
This thesis presents efficient computational methods for describing and comparing molecular structures by combining geometric and topological approaches, and gives an efficient algorithm for finding promising initial relative placements of the proteins.
Expanding protein universe and its origin from the biological Big Bang
This work has discovered that the universe of protein structures is organized hierarchically into a scale-free network and attempts to glance at the very origin of life.
Comparing and modeling protein structure
The first part of this work focuses on protein structural alignment, namely the comparison of two structures, and formalizes this problem as the optimization of a geometric similarity score over the space of rigid-body transformations, leading to an approximate polynomial-time alignment algorithm.
Representation of the Protein Universe using Classifications, Maps, and Networks
Different protein qualities were revealed in each study; many point out the uniqueness of domains of the alpha/beta SCOP (structural classification of proteins) class.
The structure of the protein universe and genome evolution
These findings suggest that genome evolution is driven by extremely general mechanisms based on the preferential attachment principle, and that protein folds and families encoded in diverse genomes show similar size distributions with notable mathematical properties.
Protein folds, functions and evolution.
The evolution of proteins and their functions is reviewed from a structural perspective in the light of the current database, finding that the number of new topologies is still increasing, although 25 new structures are now determined for each new topology.
Local Structure Comparison of Proteins
Algorithms exploiting the chain structure of proteins
Three algorithms addressing fundamental problems in computational structural biology are presented, including an automatic method for completing partial models of protein structures resolved using X-ray crystallography, and a method for speeding up Monte Carlo simulation of proteins.
Large-scale protein structure modeling of the Saccharomyces cerevisiae genome.
  • R. Sánchez, A. Sali
  • Biology, Engineering
    Proceedings of the National Academy of Sciences of the United States of America
  • 1998
The fold assignment, comparative protein structure modeling, and model evaluation were automated completely and resulted in all-atom 3D models for substantial segments of 1,071 of the yeast proteins, only 40 of which have had their 3D structure determined experimentally.


Detection of common three‐dimensional substructures in proteins
A fully automatic algorithm for three‐dimensional alignment of protein structures and for the detection of common substructures and structural repeats is presented, so fast that structural comparison of a single protein with the entire database of known protein structures can be performed routinely on a workstation.
Parser for protein folding units
An algorithm for identification of structural units by objective, quantitative criteria based on atomic interactions is proposed, which is useful for the analysis of folding principles, for modular protein design and for protein engineering.
The anatomy and taxonomy of protein structure.
Structural superposition of proteins with unknown alignment and detection of topological similarity using a six‐dimensional search algorithm
The algorithm is shown to find the best superposition of distantly related structures, and to be capable of finding similar structures to a given atomic model in the Brookhaven Protein Data Bank.
A structural census of the current population of protein sequences.
  • M. Gerstein, M. Levitt
  • Biology
    Proceedings of the National Academy of Sciences of the United States of America
  • 1997
Overall, it is found that an appreciable fraction of the known folds are present in each of the major groups of organisms, and most of the common folds are associated with many families of nonhomologous sequences, although some of the most common folds in vertebrates, such as globins or zinc fingers, are rare or absent in bacteria.
Identification of tertiary structure resemblance in proteins using a maximal common subgraph isomorphism algorithm.
A program called PROTEP is described that permits the rapid comparison of pairs of three-dimensional protein structures to identify the patterns of secondary structure elements that they have in common, using a maximal common subgraph isomorphism algorithm that is based on a clique detection procedure.
An efficient automated computer vision based technique for detection of three dimensional structural motifs in proteins.
The method discovers and ranks every piece of structural similarity between the structures compared, thus allowing the simultaneous detection of real 3-D motifs in different domains, between domains, in active sites, on surfaces, etc.
The structure of proteins; two hydrogen-bonded helical configurations of the polypeptide chain.
This work has used information about interatomic distances, bond angles, and other configurational parameters to construct two reasonable hydrogen-bonded helical configurations for the polypeptide chain; it is likely that these configurations constitute an important part of the structure of both fibrous and globular proteins, as well as of synthetic polypeptides.