Evolutionary distances for protein-coding sequences: modeling site-specific residue frequencies.
@article{Halpern1998EvolutionaryDF, title={Evolutionary distances for protein-coding sequences: modeling site-specific residue frequencies.}, author={Anne L. Halpern and William J. Bruno}, journal={Molecular biology and evolution}, year={1998}, volume={15}, number={7}, pages={910--917} }
Estimation of evolutionary distances from coding sequences must take into account protein-level selection to avoid relative underestimation of longer evolutionary distances. Current modeling of selection via site-to-site rate heterogeneity generally neglects another aspect of selection, namely position-specific amino acid frequencies. These frequencies determine the maximum dissimilarity expected for highly diverged but functionally and structurally conserved sequences, and hence are crucial…
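The link between position-specific frequencies and maximum dissimilarity can be illustrated with a small calculation. The sketch below is not the authors' method. It assumes that sites evolve independently and that, for two fully diverged sequences, the residues at a site are independent draws from that site's equilibrium frequencies, so the probability that they differ is 1 minus the sum of squared frequencies. The function name `saturation_dissimilarity` and the example frequency profiles are hypothetical.

```python
def saturation_dissimilarity(site_freqs):
    """Expected fraction of differing sites between two fully diverged sequences.

    site_freqs: one list of 20 amino acid frequencies per site (assumed input).
    Assumes residues at a site are independent draws from that site's frequencies.
    """
    per_site = []
    for freqs in site_freqs:
        if abs(sum(freqs) - 1.0) > 1e-9:
            raise ValueError("frequencies at each site must sum to 1")
        # Probability that two independent draws are the same residue.
        p_match = sum(p * p for p in freqs)
        per_site.append(1.0 - p_match)
    return sum(per_site) / len(per_site)


# Hypothetical profiles: no constraint (uniform) versus strong site-specific constraint.
uniform = [[1 / 20] * 20 for _ in range(3)]
constrained = [
    [0.5, 0.5] + [0.0] * 18,   # two residues tolerated equally
    [0.9, 0.1] + [0.0] * 18,   # one residue strongly preferred
    [1.0] + [0.0] * 19,        # invariant site
]

d_uniform = saturation_dissimilarity(uniform)
d_constrained = saturation_dissimilarity(constrained)
```

With uniform frequencies the dissimilarity saturates at 0.95, whereas the constrained profiles saturate near 0.23. A distance estimator that assumes the uniform ceiling would therefore interpret 23% dissimilarity in the constrained case as a modest divergence, even though the sequences are already saturated.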
295 Citations
Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates
- Biology
- PeerJ
- 2017
Codon-level and amino-acid-level analysis frameworks are directly comparable and yield very similar inferences, and the relationship between Rate4Site and dN∕dS is elucidated.
Mutation-selection models of coding sequence evolution with site-heterogeneous amino acid fitness profiles
- Biology
- Proceedings of the National Academy of Sciences
- 2010
A probabilistic model is proposed that accounts for the heterogeneity of amino acid fitness profiles across the coding positions of a gene; applied to a dozen real protein-coding gene alignments, it is found to produce biologically plausible inferences.
Detecting Adaptation in Protein-Coding Genes Using a Bayesian Site-Heterogeneous Mutation-Selection Codon Substitution Model
- Biology
- Molecular biology and evolution
- 2017
The use of a mutation–selection framework that includes a Dirichlet process approach to account for across-codon-site variation in amino acid fitness profiles as a null model for the detection of adaptation is studied.
Modeling site-specific amino-acid preferences deepens phylogenetic estimates of viral sequence divergence
- Biology
- Virus evolution
- 2018
Models informed by experimentally measured site-specific amino-acid preferences are found to estimate longer deep branches on phylogenies of influenza virus hemagglutinin. This underscores the importance of modeling site-specific amino-acid preferences when estimating deep divergence times, but also shows the inherent limitations of approaches that fail to account for how these preferences shift over time.
Modeling site-specific amino-acid preferences deepens phylogenetic estimates of viral divergence
- Biology
- bioRxiv
- 2018
This work underscores the importance of modeling site-specific amino-acid preferences when estimating deep divergence times—but also shows the inherent limitations of approaches that fail to account for how these preferences shift over time.
Population Genetics Based Phylogenetics Under Stabilizing Selection for an Optimal Amino Acid Sequence: A Nested Modeling Approach
- Biology
- bioRxiv
- 2018
A new phylogenetic approach SelAC (Selection on Amino acids and Codons), whose substitution rates are based on a nested model linking protein expression to population genetics, indicates there is great potential for more accurate inference of phylogenetic trees and branch lengths from already existing data through the use of nested, mechanistic models.
Theory of measurement for site-specific evolutionary rates in amino-acid sequences
- Biology
- bioRxiv
- 2018
This work develops a theory of measurement for site-specific evolutionary rates, by analytically solving the maximum-likelihood equations for rate inference performed on sequences evolved under a mutation–selection model and uses misspecification as a deliberate strategy to result in robust and meaningful parameter inference.
Site-Specific Amino Acid Preferences Are Mostly Conserved in Two Closely Related Protein Homologs
- Biology
- Molecular biology and evolution
- 2015
It is found that site-specific evolutionary models informed by the experiments greatly outperformed nonsite-specific alternatives in fitting phylogenies of nucleoproteins from human, swine, equine, and avian influenza.
Selective Constraints on Amino Acids Estimated by a Mechanistic Codon Substitution Model with Multiple Nucleotide Changes
- Biology
- PLoS ONE
- 2011
A codon-based model is developed in which the mutational tendencies of codons, the genetic code, and the strength of selective constraints against amino acid replacements can be tailored to a given gene. It yields biologically meaningful information at both nucleotide and amino acid levels from codon and protein sequences.
An Improved Codon Modeling Approach for Accurate Estimation of the Mutation Bias
- Biology
- bioRxiv
- 2021
An improved codon modeling approach is presented in which the fixation rate is no longer treated as a scalar but as a tensor unfolding along multiple directions, which gives an accurate representation of how mutation and selection oppose each other at equilibrium.
References
A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome.
- Biology
- Molecular biology and evolution
- 1994
Simulations help confirm previous suggestions that silent sites are saturated, leaving no evidence of heterogeneity in synonymous substitution rates, and confirm previous findings that substitution rates in the chloroplast genome are subject to both lineage-specific and locus-specific effects.
A codon-based model of nucleotide substitution for protein-coding DNA sequences.
- Biology
- Molecular biology and evolution
- 1994
Analyses of two data sets suggest that the new codon-based model can provide a better fit to data than can nucleotide-based models and can produce more reliable estimates of certain biologically important measures such as the transition/transversion rate ratio and the synonymous/nonsynonymous substitution rate ratio.
Codon substitution in evolution and the "saturation" of synonymous changes.
- Biology
- Genetics
- 1983
A mathematical model for codon substitution is presented, taking into account unequal mutation rates among different nucleotides and purifying selection, and it is shown that, when the mutation rates are not equal, the estimate of synonymous substitutions obtained by Perler et al. increases nonlinearly, although the true number of synonymous substitutions increases linearly.
Estimation of Reversible Substitution Matrices from Multiple Pairs of Sequences
- Computer Science
- Journal of Molecular Evolution
- 1997
A weighting method for pairs of taxa related by a known tree that results in uniform weights for all branches and resembles one obtained using maximum likelihood, and the resulting distance measure is shown to have better linearity than is obtained in a less general model.
Using substitution probabilities to improve position-specific scoring matrices
- Computer Science
- Comput. Appl. Biosci.
- 1996
This work introduces a simple method for computing pseudo-counts that combines the diversity observed in each alignment position with amino acid substitution probabilities and was a substantial improvement over the traditional average score method used for constructing profiles.
A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data.
- Biology
- Molecular biology and evolution
- 1998
The genetic algorithm described here required only 6% of the computational effort required by a conventional heuristic search using tree bisection/reconnection (TBR) branch swapping to obtain the same maximum-likelihood topology.
Combining protein evolution and secondary structure.
- Biology
- Molecular biology and evolution
- 1996
An evolutionary model that combines protein secondary structure and amino acid replacement is introduced. It allows likelihood analysis of aligned protein sequences and does not require the…
Amino acid substitution matrices from protein blocks.
- Biology
- Proceedings of the National Academy of Sciences of the United States of America
- 1992
This work has derived substitution matrices from about 2000 blocks of aligned sequence segments characterizing more than 500 groups of related proteins, leading to marked improvements in alignments and in searches using queries from each of the groups.
A Hidden Markov Model approach to variation among sites in rate of evolution.
- Biology
- Molecular biology and evolution
- 1996
The method of Hidden Markov Models is used to allow for unequal and unknown evolutionary rates at different sites in molecular sequences, and it is shown how to use the Newton-Raphson method to estimate branch lengths of a phylogeny and to infer from a phylogeny what assignment of rates to sites has the largest posterior probability.
Hidden Markov models in computational biology. Applications to protein modeling.
- Biology, Computer Science
- Journal of molecular biology
- 1994
The results suggest the presence of an EF-hand calcium binding motif in a highly conserved and evolutionary preserved putative intracellular region of 155 residues in the alpha-1 subunit of L-type calcium channels which play an important role in excitation-contraction coupling.