Mapping the Protein Universe

@article{Holm1996MappingTP,
  title={Mapping the Protein Universe},
  author={Liisa Holm and Chris Sander},
  journal={Science},
  year={1996},
  volume={273},
  pages={595--602}
}
The comparison of the three-dimensional shapes of protein molecules poses a complex algorithmic problem. Its solution provides biologists with computational tools to organize the rapidly growing set of thousands of known protein shapes, to identify new types of protein architecture, and to discover unexpected evolutionary relations, reaching back billions of years, between protein molecules. Protein shape comparison also improves tools for identifying gene functions in genome databases by… 
A global representation of the protein fold space
TLDR
A 3D map of the protein fold space in which structurally related folds are represented by spatially adjacent points reveals a high-level organization of the fold space that is intuitively interpretable.
Bridging protein local structures and protein functions
TLDR
The methods currently applied to in silico annotation of protein function, which draw on vast amounts of accumulated sequence and structure data, are summarized, with emphasis on newly developed structure-based methods that can identify local structural motifs and reveal their relationship with protein functions.
Geometric and topological methods in protein structure analysis
TLDR
This thesis describes efficient computational methods for describing and comparing molecular structures by combining geometric and topological approaches, and presents an efficient algorithm to find promising initial relative placements of the proteins.
Expanding protein universe and its origin from the biological Big Bang
TLDR
This work has discovered that the universe of protein structures is organized hierarchically into a scale-free network and attempts to glance at the very origin of life.
Comparing and modeling protein structure
TLDR
The first part of this work focuses on protein structural alignment, namely the comparison of two structures, and formalizes this problem as the optimization of a geometric similarity score over the space of rigid body transformations, leading to an approximate polynomial time alignment algorithm.
Representation of the Protein Universe using Classifications, Maps, and Networks
TLDR
Different protein qualities were revealed in each study; many point out the uniqueness of domains of the alpha/beta SCOP (structural classification of proteins) class.
The structure of the protein universe and genome evolution
TLDR
These findings suggest that genome evolution is driven by extremely general mechanisms based on the preferential attachment principle, and that protein folds and families encoded in diverse genomes show similar size distributions with notable mathematical properties.
Protein folds: molecular systematics in three dimensions
TLDR
All basic protein folds will likely be determined in the near future, laying the foundation for a comprehensive understanding of the biochemical and cellular functions of whole organisms.

References

Detection of common three‐dimensional substructures in proteins
TLDR
A fully automatic algorithm for three‐dimensional alignment of protein structures and for the detection of common substructures and structural repeats is presented, so fast that structural comparison of a single protein with the entire database of known protein structures can be performed routinely on a workstation.
Parser for protein folding units
TLDR
An algorithm for identification of structural units by objective, quantitative criteria based on atomic interactions is proposed, which is useful for the analysis of folding principles, for modular protein design and for protein engineering.
The anatomy and taxonomy of protein structure.
A structural census of the current population of protein sequences.
  • M. Gerstein, M. Levitt
  • Biology
    Proceedings of the National Academy of Sciences of the United States of America
  • 1997
TLDR
Overall, it is found that an appreciable fraction of the known folds are present in each of the major groups of organisms, and most of the common folds are associated with many families of nonhomologous sequences, although some of the most common folds in vertebrates, such as globins or zinc fingers, are rare or absent in bacteria.
Identification of tertiary structure resemblance in proteins using a maximal common subgraph isomorphism algorithm.
TLDR
A program called PROTEP is described that permits the rapid comparison of pairs of three-dimensional protein structures to identify the patterns of secondary structure elements that they have in common, using a maximal common subgraph isomorphism algorithm that is based on a clique detection procedure.
An efficient automated computer vision based technique for detection of three dimensional structural motifs in proteins.
TLDR
The method discovers and ranks every piece of structural similarity between the structures compared, thus allowing the simultaneous detection of real 3-D motifs in different domains, between domains, in active sites, surfaces etc.
The structure of proteins; two hydrogen-bonded helical configurations of the polypeptide chain.
TLDR
This work has used information about interatomic distances, bond angles, and other configurational parameters to construct two reasonable hydrogen-bonded helical configurations for the polypeptide chain; it is likely that these configurations constitute an important part of the structure of both fibrous and globular proteins, as well as of synthetic polypeptides.
Are protein folds atypical?
  • H. Li, C. Tang, N. Wingreen
  • Biology
    Proceedings of the National Academy of Sciences of the United States of America
  • 1998
TLDR
It is argued that the most common protein folds are the most atypical in the space of possible structures: those that lie far from other structures in the high-dimensional space have more sequences that fold into them and are thermodynamically more stable.