# Estimating the pattern of nucleotide substitution

@article{Yang2004EstimatingTP, title={Estimating the pattern of nucleotide substitution}, author={Ziheng Yang}, journal={Journal of Molecular Evolution}, year={2004}, volume={39}, pages={105-111} }

Knowledge of the pattern of nucleotide substitution is important both to our understanding of molecular sequence evolution and to reliable estimation of phylogenetic relationships. The method of parsimony analysis, which has been used to estimate substitution patterns in real sequences, has serious drawbacks and leads to results difficult to interpret. In this paper a model-based maximum likelihood approach is proposed for estimating substitution patterns in real sequences. Nucleotide…

## 866 Citations

Evolutionary distance estimation under heterogeneous substitution pattern among lineages.

- Biology
- Molecular biology and evolution
- 2002

This work presents a simple modification for existing distance estimation methods to relax the assumption of the substitution pattern homogeneity among lineages when analyzing DNA and protein sequences and shows that the modified method performs much better than the LogDet methods, which do not require the homogeneity assumption in estimating the number of substitutions per site.

Estimation of Phylogeny Using a General Markov Model

- Biology
- Evolutionary bioinformatics online
- 2007

The non-homogeneous model of nucleotide substitution proposed by Barry and Hartigan is the most general model of DNA evolution assuming an independent and identical process at each site, and the most likely tree under the three models is determined.

Reconstruction of ancestral nucleotide sequences and estimation of substitution frequencies in a star phylogeny.

- Biology
- Gene
- 2007

Phylogenetic analysis using parsimony and likelihood methods

- Biology
- Journal of Molecular Evolution
- 2005

Evidence was presented showing that the Felsenstein approach does not share the asymptotic efficiency of the maximum likelihood estimator of a statistical parameter, and its performance relative to that of the likelihood method was especially noted.

The effects of nucleotide substitution model assumptions on estimates of nonparametric bootstrap support.

- Biology
- Molecular biology and evolution
- 2002

This study examines the relationship between nucleotide substitution model complexity and nonparametric bootstrap support under maximum likelihood (ML) for six data sets for which the true relationships are known with a high degree of certainty, and raises several issues regarding the process of model selection.

Maximum-likelihood models for combined analyses of multiple sequence data

- Biology
- Journal of Molecular Evolution
- 2006

Models of nucleotide substitution were constructed for combined analyses of heterogeneous sequence data (such as those of multiple genes) from the same set of species. The models account for…

An empirical examination of the utility of codon-substitution models in phylogeny reconstruction.

- Biology
- Systematic biology
- 2005

Although computational burden makes codon models infeasible for tree search in large data sets, the authors suggest that they may be useful for comparing candidate trees, and caution against the use of overly complex substitution models.

Evaluation of Ancestral Sequence Reconstruction Methods to Infer Nonstationary Patterns of Nucleotide Substitution

- Biology
- Genetics
- 2015

Methods to correct for systematic biases are implemented, and computer simulation is used to evaluate their performance when the substitution process is nonstationary. It is suggested that the new methods may be useful for studying complex patterns of nucleotide substitution in large genomic data sets.

Should We Use Model-Based Methods for Phylogenetic Inference When We Know That Assumptions About Among-Site Rate Variation and Nucleotide Substitution Pattern Are Violated?

- Biology
- 2002

These results suggest that application of increasingly general and complex models would sometimes lead to decreased efficiency, despite the fact that the more complex models almost always provide significantly better access to real data than the simpler models.

Statistical comparison of nucleotide, amino acid, and codon substitution models for evolutionary analysis of protein-coding sequences.

- Biology
- Systematic biology
- 2009

By analyzing divergent and conserved interspecific mammalian sequences and intraspecific human population data, the superiority of the codon substitution models is shown and the advantages and disadvantages of the models of the 3 types are discussed.

## References

Estimation of evolutionary distances between homologous nucleotide sequences.

- Biology
- Proceedings of the National Academy of Sciences of the United States of America
- 1981

It is pointed out that the rates of synonymous base substitutions not only are very high but also are roughly equal to each other between genes even when amino acid-altering substitution rates are quite different and that this is consistent with the neutral mutation-random drift hypothesis of molecular evolution.

A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences

- Biology
- Journal of Molecular Evolution
- 2005

Some examples were worked out using reported globin sequences to show that synonymous substitutions occur at much higher rates than amino acid-altering substitutions in evolution.

Statistical tests of models of DNA substitution

- Biology
- Journal of Molecular Evolution
- 2004

A test statistic suggested by Cox is employed to test the adequacy of some statistical models of DNA sequence evolution used in the phylogenetic inference method introduced by Felsenstein.

Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites.

- Biology
- Molecular biology and evolution
- 1993

Felsenstein's maximum-likelihood approach for inferring phylogeny from DNA sequences is extended to the case where substitution rates over sites are described by the gamma distribution and a numerical example is presented to show that the method fits the data better than do previous models.

Estimation of average number of nucleotide substitutions when the rate of substitution varies with nucleotide

- Biology
- Journal of Molecular Evolution
- 2005

Summary: A formal mathematical analysis of Kimura's (1981) six-parameter model of nucleotide substitution for the case of unequal substitution rates among different pairs of nucleotides is conducted,…

Patterns of nucleotide substitution in pseudogenes and functional genes

- Biology
- Journal of Molecular Evolution
- 2005

The pattern obtained suggests that transition mutations occur somewhat more frequently than transversion mutations and that mutations result more often in A or T than in G or C.

Methods for inferring phylogenies from nucleic acid sequence data by using maximum likelihood and linear invariants.

- Biology
- Molecular biology and evolution
- 1991

The method of linear invariants described by Cavender, which includes Lake's method of evolutionary parsimony as a special case, is essentially a form of the likelihood-ratio method, which may be used to determine the feasibility of any tree for which the maximum likelihood can be computed.

Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees.

- Biology
- Molecular biology and evolution
- 1993

A new mathematical method for estimating the number of transitional and transversional substitutions per site, as well as the total number of nucleotide substitutions, suggested that the transition/transversion ratio for the entire control region was approximately 15 and nearly the same for the two species.

Maximum likelihood inference of protein phylogeny and the origin of chloroplasts

- Biology
- Journal of Molecular Evolution
- 2005

A maximum likelihood method based on a Markov model that takes into account the unequal transition probabilities among pairs of amino acids and does not assume constancy of rate among different lineages is expected to be powerful in inferring phylogeny among distantly related proteins.

A new method for calculating evolutionary substitution rates

- Biology
- Journal of Molecular Evolution
- 2005

It is found that the method applies satisfactorily to the three former species, while the last appears to be outside the scope of the present approach.