Capturing coevolutionary signals in repeat proteins

@article{Espada2015CapturingCS,
  title={Capturing coevolutionary signals in repeat proteins},
  author={Roc{\'i}o Espada and R. G. Parra and Thierry Mora and Aleksandra M. Walczak and Diego U. Ferreiro},
  journal={BMC Bioinformatics},
  year={2015},
  volume={16}
}
Background: The analysis of correlations of amino acid occurrences in globular domains has led to the development of statistical tools that can identify native contacts – portions of the chains that come to close distance in folded structural ensembles. Here we introduce a direct coupling analysis for repeat proteins – natural systems for which the identification of folding domains remains challenging. Results: We show that the inherent translational symmetry of repeat protein sequences introduces a… 
Patterns of coevolving amino acids unveil structural and dynamical domains
TLDR
Even large-scale structural and functionally related properties can be recovered from inference methods applied to evolutionary-related sequences, as shown in the context of the native structure.
Inferring repeat-protein energetics from evolutionary information
TLDR
This work traces the variations in the energetic scores of natural proteins and relates them to their experimental characterization, and proposes a description for the energetic variation given by sequence modifications in repeat proteins, systems for which the overall problem is simplified by their inherent symmetry.
Origins of coevolution between residues distant in protein 3D structures
TLDR
Overall, the results suggest that directly coevolving residue pairs not in repeat proteins are spatially proximal in at least one biologically relevant protein conformation within the family; there is little evidence for direct coupling between residues at spatially separated allosteric and functional sites or for increased direct coupling between residue pairs on putative allosteric pathways connecting them.
Accurate contact-based modelling of repeat proteins predicts the structure of Curlin and SPW repeats
TLDR
It is shown that the deep learning-based PconsC4 is more effective for predicting both intra- and interunit contacts among a comprehensive set of repeat proteins.
Bayesian statistical approach for protein residue-residue contact prediction
TLDR
This work presents two different approaches for addressing the limitations of contact prediction methods, including a Bayesian statistical approach that provides posterior probability estimates for residue-residue contacts and eradicates the use of heuristics.
Structural and Energetic Characterization of the Ankyrin Repeat Protein Family
TLDR
A strong linear correlation between the conservation of the energetic features in the repeat arrays and their sequence variations is found, and new insights are discussed into the organization and function of these ubiquitous proteins.
Connecting the Sequence-Space of Bacterial Signaling Proteins to Phenotypes Using Coevolutionary Landscapes
TLDR
A Potts model for TCS is constructed that can quantitatively predict how mutating amino acid identities affect the interaction between TCS partners and non-partners and finds that the best predictions accurately reproduce the amino acid combinations found in experiment, which enable functional signaling with its partner PhoP.
Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families
TLDR
It is shown that deep learning-based methods (trRosetta, DeepMetaPsicov (DMP) and PconsC4) overcome this problem and can predict intra- and inter-unit contacts in repeat proteins.
Size and structure of the sequence space of repeat proteins
TLDR
It is shown that the coding space of a given protein family, the total number of sequences in that family, can be estimated using models of maximum entropy trained on multiple sequence alignments of naturally occurring amino acid sequences.

References

SHOWING 1-10 OF 48 REFERENCES
Direct-coupling analysis of residue coevolution captures native contacts across many protein families
TLDR
The findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, contingent on the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.
Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models.
TLDR
The pseudolikelihood method, applied to 21-state Potts models describing the statistical properties of families of evolutionarily related proteins, significantly outperforms existing approaches to the direct-coupling analysis, the latter being based on standard mean-field techniques.
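For orientation, the 21-state Potts model and the pseudolikelihood objective mentioned in this summary can be written in their usual textbook form. This is a sketch under assumed notation: the symbols $h_i$, $J_{ij}$, $Z$, $L$ and $B$ are introduced here for illustration and are not taken from the entry itself, and the couplings are assumed symmetric, $J_{ij}(a,b) = J_{ji}(b,a)$.

```latex
% Potts model over an aligned sequence (a_1, ..., a_L); each a_i is one of
% 21 states (20 amino acids plus the alignment gap). Notation is assumed.
P(a_1,\dots,a_L) = \frac{1}{Z}
  \exp\Big( \sum_{i=1}^{L} h_i(a_i) + \sum_{1 \le i < j \le L} J_{ij}(a_i,a_j) \Big),
\qquad a_i \in \{1,\dots,21\}

% Conditional probability of one site given all the others; the partition
% function Z cancels, which is what makes the pseudolikelihood tractable.
P(a_i \mid a_{\setminus i}) =
  \frac{\exp\big( h_i(a_i) + \sum_{j \ne i} J_{ij}(a_i,a_j) \big)}
       {\sum_{c=1}^{21} \exp\big( h_i(c) + \sum_{j \ne i} J_{ij}(c,a_j) \big)}

% Pseudolikelihood objective over B aligned sequences a^{(b)}, maximized
% with respect to the fields h and couplings J.
\ell_{\mathrm{PL}}(h,J) = \sum_{b=1}^{B} \sum_{i=1}^{L}
  \log P\big( a_i^{(b)} \mid a_{\setminus i}^{(b)} \big)
```

Practical implementations typically add regularization and sequence reweighting on top of this objective; those details are omitted here.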
Coevolutionary signals across protein lineages help capture multiple protein conformations
TLDR
A signature of functionally important states in several protein families is revealed, using direct coupling analysis, which detects residue pair coevolution of protein sequence composition, and this signature is exploited in a protein structure-based model to uncover conformational diversity, including hidden functional configurations.
Direct coupling analysis for protein contact prediction.
TLDR
Direct Coupling Analysis has been shown to produce highly accurate estimates of amino-acid pairs that have direct reciprocal constraints in evolution. Instructions and protocols are provided on how to use the algorithmic implementations of DCA, from data extraction to the visualization of predicted contacts in contact maps or representative protein structures.
How frequent are correlated changes in families of protein sequences?
  • E. Neher
  • Biology
    Proceedings of the National Academy of Sciences of the United States of America
  • 1994
TLDR
A statistical theory is presented which allows evaluation of correlations in a family of aligned protein sequences by assigning a scalar metric to each type of amino acid and calculating correlation coefficients of these quantities at different positions and it is found that there is a high correlation between fluctuations in neighboring charges.
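The idea summarized above, assigning a scalar to each amino acid type and correlating those values between alignment positions, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `CHARGE` metric (net side-chain charge, with uncharged residues and gaps mapped to 0), the helper names, and the toy alignment are hypothetical and are not taken from the cited paper.

```python
import math

# Hypothetical scalar metric (assumption): approximate net side-chain charge.
# Residues not listed, including gaps, are treated as 0.
CHARGE = {"D": -1.0, "E": -1.0, "K": 1.0, "R": 1.0}


def column_values(alignment, position, metric):
    """Map the residues of one alignment column to scalar values."""
    return [metric.get(seq[position], 0.0) for seq in alignment]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either column is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def position_correlation(alignment, i, j, metric=CHARGE):
    """Correlate the scalar metric between alignment columns i and j."""
    return pearson(column_values(alignment, i, metric),
                   column_values(alignment, j, metric))


# Toy alignment (invented): charges at columns 0 and 1 always compensate,
# while column 2 carries no charge at all.
toy_alignment = ["KDA", "DKA", "KDG", "DKG"]
print(position_correlation(toy_alignment, 0, 1))
print(position_correlation(toy_alignment, 0, 2))
```

In the toy alignment the first two columns are perfectly anticorrelated in charge, and the third column is constant under this metric, so its correlation is reported as zero by convention.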
PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments
TLDR
A novel method, PSICOV, is presented, which introduces the use of sparse inverse covariance estimation to the problem of protein contact prediction and displays a mean precision substantially better than the best performing normalized mutual information approach and Bayesian networks.
The network of stabilizing contacts in proteins studied by coevolutionary data.
TLDR
The algorithm is optimized to calculate effective energies between the residues, validating the approach both back-calculating interaction energies in a model system, and predicting the free energies associated to mutations in real systems.
Identification of direct residue contacts in protein–protein interaction by message passing
TLDR
This work has developed a method that combines covariance analysis with global inference analysis and successfully and robustly identified residue pairs that are proximal in space without resorting to ad hoc tuning parameters, both for heterointeractions between sensor kinase and response regulator proteins and for homointeractions between RR proteins.
Detecting Repetitions and Periodicities in Proteins by Tiling the Structural Space
TLDR
By an exhaustive analysis of the distribution of structural repeats using a robust metric, this work defines those portions of a protein molecule that best describe the overall structure as a tessellation of basic units, and can identify structural units that can be encoded by a variety of distinct amino acid sequences.
Improved Contact Predictions Using the Recognition of Protein Like Contact Patterns
TLDR
PconsC2 is a novel method that uses a deep learning approach to identify protein-like contact patterns to improve contact predictions and is superior to earlier methods based on statistical inferences in comparison to state of the art methods using machine learning.