Inference of Co-Evolving Site Pairs: an Excellent Predictor of Contact Residue Pairs in Protein 3D structures

Sanzo Miyazawa
Residue-residue interactions that fold a protein into a unique three-dimensional structure and enable it to perform a specific function impose structural and functional constraints on each residue site. Selective constraints on residue sites are recorded in amino acid orders in homologous sequences and also in the evolutionary trace of amino acid substitutions. A challenge is to extract direct dependences between residue sites by removing indirect dependences through other residues within a protein or…
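The distinction between direct and indirect dependences drawn in the abstract can be illustrated with a small numerical sketch. It is a generic illustration using partial correlations from an inverted correlation matrix, not necessarily the procedure used in the paper. The three synthetic "sites" `x`, `y`, `z`, the noise level, and the helper name `partial_correlations` are all assumptions made for this example: `x` influences `y`, and `y` influences `z`, so `x` and `z` are coupled only indirectly.

```python
import numpy as np


def partial_correlations(samples):
    """Partial correlation of each column pair, given all other columns.

    Hypothetical helper for illustration: inverts the sample correlation
    matrix and rescales the off-diagonal entries of the inverse.
    """
    corr = np.corrcoef(samples, rowvar=False)
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial


# Synthetic chain x -> y -> z (assumed toy data, not protein sequences).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
y = x + 0.5 * rng.normal(size=n)
z = y + 0.5 * rng.normal(size=n)
samples = np.column_stack([x, y, z])

plain = np.corrcoef(samples, rowvar=False)
partial = partial_correlations(samples)

print("plain correlation x-z:  ", round(plain[0, 2], 3))
print("partial correlation x-z:", round(partial[0, 2], 3))
print("partial correlation x-y:", round(partial[0, 1], 3))
```

In this toy setting the plain correlation between `x` and `z` is large even though they never interact directly, while their partial correlation is close to zero once `y` is accounted for; the directly coupled pair `x`-`y` keeps a clearly non-zero partial correlation.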


Direct-coupling analysis of residue coevolution captures native contacts across many protein families
The findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, contingent on the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.
Can three-dimensional contacts in protein structures be predicted by analysis of correlated mutations?
A new experimental approach to protein structure determination is suggested in which selection of functional mutants after random mutagenesis and analysis of correlated mutations provide sufficient proximity constraints for calculation of the protein fold.
Disentangling Direct from Indirect Co-Evolution of Residues in Protein Alignments
This work adapts a recently developed Bayesian network model into a rigorous procedure for disentangling direct from indirect statistical dependencies, and demonstrates that this method not only successfully accomplishes this task, but also allows contacts with weak statistical dependency to be detected.
Correlated mutations and residue contacts in proteins
A simple and general method is presented to analyze correlations in mutational behavior between different positions in a multiple sequence alignment to predict contact maps for each of 11 protein families and compare the result with the contacts determined by crystallography.
Protein 3D Structure Computed from Evolutionary Sequence Variation
Surprisingly, it is found that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures, and the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy.
Coevolving protein residues: maximum likelihood identification and relationship to structure.
A maximum likelihood method is developed and applied that allows for correlations induced by phylogenetic relationships and for variation in rate of evolution along branches, and does not rely on accurate reconstruction of ancestral nodes.
Using information theory to search for co-evolving residues in proteins
The performance of various normalizations of MI in enhancing detection of co-evolving positions was assessed and it was found that normalization by the pair entropy was optimal.
Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction
A rapid, simple and general method based on information theory that accurately estimates the level of background mutual information for each pair of positions in a given protein family, and correctly identifies substantially more coevolving positions in protein families than any existing method.
Structural Constraints on the Covariance Matrix Derived from Multiple Aligned Protein Sequences
New measures to impose constraints that make the contact map more consistent with a three-dimensional structure are introduced, including global (bulk) properties and local secondary structure properties.
Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading.
Attractive inter-residue contact energies for proteins have been re-evaluated with the same assumptions and approximations used originally by us in 1985, but with a significantly larger set of