# Phylogenetic Tree Construction Using Markov Chain Monte Carlo

@article{Li2000PhylogeneticTC, title={Phylogenetic Tree Construction Using Markov Chain Monte Carlo}, author={Shuying S Li and Dennis K. Pearl and Hani Doss}, journal={Journal of the American Statistical Association}, year={2000}, volume={95}, pages={493 - 508} }

Abstract: We describe a Bayesian method based on Markov chain simulation to study the phylogenetic relationship in a group of DNA sequences. Under simple models of mutational events, our method produces a Markov chain whose stationary distribution is the conditional distribution of the phylogeny given the observed sequences. Our algorithm strikes a reasonable balance between the desire to move globally through the space of phylogenies and the need to make computationally feasible moves in areas…
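The central idea in the abstract, a Markov chain whose stationary distribution is a target posterior, can be illustrated with a toy Metropolis sampler. This is only a minimal sketch and not the authors' algorithm: the three states stand in for the three unrooted topologies of four taxa, the `weights` are hypothetical unnormalized posterior values invented for illustration, and the uniform symmetric proposal is an assumption chosen for simplicity.

```python
import random


def metropolis_sample(weights, n_steps, seed=0):
    """Run a Metropolis chain on states 0..K-1 and return visit frequencies.

    The chain's stationary distribution is proportional to `weights`.
    """
    rng = random.Random(seed)
    k = len(weights)
    state = 0
    counts = [0] * k
    for _ in range(n_steps):
        # Symmetric proposal: choose one of the other k-1 states uniformly.
        proposal = (state + 1 + int(rng.random() * (k - 1))) % k
        # Metropolis acceptance probability: min(1, w[proposal] / w[state]).
        if rng.random() < min(1.0, weights[proposal] / weights[state]):
            state = proposal
        counts[state] += 1
    return [c / n_steps for c in counts]


# Hypothetical unnormalized posterior weights for three tree topologies
# (illustrative values only, not taken from the paper).
weights = [6.0, 3.0, 1.0]
freqs = metropolis_sample(weights, 200_000)
print([round(f, 3) for f in freqs])
```

Over a long run the visit frequencies should approach the normalized weights (0.6, 0.3, 0.1 here). A real phylogenetic sampler replaces the three-state toy space with tree topologies and branch lengths, and replaces the weights with likelihood times prior.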

## 268 Citations

Bayesian phylogenetic inference via Markov chain Monte Carlo methods.

- Mathematics, Medicine
- Biometrics
- 1999

We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a…

Limitations of Markov chain Monte Carlo algorithms for Bayesian inference of phylogeny

- Biology, Mathematics
- 2006

This paper presents the first theoretical work analyzing the rate of convergence of several Markov chains widely used in phylogenetic inference, and proves that many of the popular Markov chains take exponentially long to reach their stationary distribution.

Markov chain Monte Carlo for the Bayesian analysis of evolutionary trees from aligned molecular sequences

- Mathematics
- 1999

We show how to quantify the uncertainty in a phylogenetic tree inferred from molecular sequence information. Given a stochastic model of evolution, the Bayesian solution is simply to form a…

Bayesian phylogenetic inference via Monte Carlo methods

- Computer Science
- 2012

The combinatorial sequential Monte Carlo (CSMC) method is proposed to generalize applications of SMC to non-clock tree inference based on the existence of a flexible partially ordered set (poset) structure, and it is presented at a level of generality directly applicable to many other combinatorial spaces.

Markov chain Monte Carlo and its applications to phylogenetic tree construction

- Computer Science
- 2007

This thesis forms a novel Bayesian model for phylogenetic tree construction based on recent studies that incorporates known information about the evolutionary history of the species, referred to as the species phylogeny, in a statistically rigorous way and develops an inference algorithm based on a Markov chain Monte Carlo method in order to overcome the computational complexity inherent in the problem.

Phylogenetic MCMC Algorithms Are Misleading on Mixtures of Trees

- Mathematics, Medicine
- Science
- 2005

It is proved that the Markov chains take exponentially many iterations to converge to the posterior distribution, which means that when the data contain potentially conflicting phylogenetic signals, phylogenetic reconstruction should be performed separately on each signal.

Guided tree topology proposals for Bayesian phylogenetic inference.

- Mathematics, Medicine
- Systematic biology
- 2012

This work investigates the performance of common MCMC proposal distributions in terms of median and variance of run time to convergence on 11 data sets, and introduces two new Metropolized Gibbs Samplers for moving through "tree space".

Parallel algorithms for Bayesian phylogenetic inference

- Computer Science
- J. Parallel Distributed Comput.
- 2003

This paper describes parallel algorithms and their MPI-based parallel implementation for MCMC-based Bayesian phylogenetic inference and identifies a number of important points, including a superlinear speedup due to more effective cache usage and the point at which additional processors slow down the process due to communication overhead.

Bayesian selection of continuous-time Markov chain evolutionary models.

- Biology, Medicine
- Molecular biology and evolution
- 2001

A reversible jump Markov chain Monte Carlo approach to estimating the posterior distribution of phylogenies based on aligned DNA/RNA sequences under several hierarchical evolutionary models is developed, and it is found that the Kimura model is too restrictive and that the Hasegawa, Kishino, and Yano model can be rejected for some data sets.

Geometric ergodicity of a hybrid sampler for Bayesian inference of phylogenetic branch lengths.

- Mathematics, Medicine
- Mathematical biosciences
- 2015

This work establishes geometric convergence for a particular Markov chain that is used to sample branch lengths under a fairly general class of nucleotide substitution models and provides a numerical method for estimating the time this Markov chain takes to converge.

## References

SHOWING 1-10 OF 87 REFERENCES

Bayesian phylogenetic inference via Markov chain Monte Carlo methods.

- Mathematics, Medicine
- Biometrics
- 1999

We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a…

Bayesian phylogenetic inference using DNA sequences: a Markov Chain Monte Carlo Method.

- Medicine, Biology
- Molecular biology and evolution
- 1997

An improved Bayesian method is presented for estimating phylogenetic trees using DNA sequence data, and the posterior probabilities of phylogenies are used to estimate the maximum posterior probability (MAP) tree, which has a probability of approximately 95%.

Markov Chain Monte Carlo Algorithms for the Bayesian Analysis of Phylogenetic Trees

- Biology
- 1999

We further develop the Bayesian framework for analyzing aligned nucleotide sequence data to reconstruct phylogenies, assess uncertainty in the reconstructions, and perform other statistical…

Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. I. Four taxa with a molecular clock.

- Biology, Medicine
- Molecular biology and evolution
- 1992

Bootstrapping is a conservative approach for estimating the reliability of an inferred phylogeny for four taxa by using model trees of three taxa with an outgroup and by assuming a constant rate of nucleotide substitution.

A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data.

- Biology, Medicine
- Molecular biology and evolution
- 1998

The genetic algorithm described here required only 6% of the computational effort required by a conventional heuristic search using tree bisection/reconnection (TBR) branch swapping to obtain the same maximum-likelihood topology.

Stochastic search strategy for estimation of maximum likelihood phylogenetic trees.

- Mathematics, Medicine
- Systematic biology
- 2001

A stochastic search strategy for estimation of the ML tree is presented; it is based on a simulated annealing algorithm and is less likely to become trapped in local optima than existing algorithms for ML tree estimation.

Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites.

- Biology, Medicine
- Molecular biology and evolution
- 1993

Felsenstein's maximum-likelihood approach for inferring phylogeny from DNA sequences is extended to the case where substitution rates over sites are described by the gamma distribution and a numerical example is presented to show that the method fits the data better than do previous models.

Inconsistency of evolutionary tree topology reconstruction methods when substitution rates vary across characters.

- Mathematics, Medicine
- Mathematical biosciences
- 1996

Distance and maximum likelihood methods for topology estimation have been shown to be consistent under the homogeneity assumption; examples are given showing that these methods can fail to be consistent when the homogeneity assumption is relaxed.

Practical Markov Chain Monte Carlo

- Computer Science
- 1992

The case is made for basing all inference on one long run of the Markov chain and estimating the Monte Carlo error by standard nonparametric methods well-known in the time-series and operations research literature.

Estimating effective population size and mutation rate from sequence data using Metropolis-Hastings sampling.

- Medicine, Biology
- Genetics
- 1995

We present a new way to make a maximum likelihood estimate of the parameter 4N mu (effective population size times mutation rate per site, or theta) based on a population sample of molecular…