Phylogenetic Tree Construction Using Markov Chain Monte Carlo

@article{Li2000PhylogeneticTC,
  title={Phylogenetic Tree Construction Using Markov Chain Monte Carlo},
  author={Shuying S Li and Dennis K. Pearl and Hani Doss},
  journal={Journal of the American Statistical Association},
  year={2000},
  volume={95},
  pages={493--508}
}
Abstract: We describe a Bayesian method based on Markov chain simulation to study the phylogenetic relationship in a group of DNA sequences. Under simple models of mutational events, our method produces a Markov chain whose stationary distribution is the conditional distribution of the phylogeny given the observed sequences. Our algorithm strikes a reasonable balance between the desire to move globally through the space of phylogenies and the need to make computationally feasible moves in areas…
Bayesian phylogenetic inference via Markov chain Monte Carlo methods.
We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a
Limitations of Markov chain Monte Carlo algorithms for Bayesian inference of phylogeny
TLDR
This paper presents the first theoretical work analyzing the rate of convergence of several Markov chains widely used in phylogenetic inference, and proves that many of the popular Markov chains take exponentially long to reach their stationary distribution.
Markov chain Monte Carlo for the Bayesian analysis of evolutionary trees from aligned molecular sequences
We show how to quantify the uncertainty in a phylogenetic tree inferred from molecular sequence information. Given a stochastic model of evolution, the Bayesian solution is simply to form a
Bayesian phylogenetic inference via Monte Carlo methods
TLDR
The combinatorial sequential Monte Carlo (CSMC) method is proposed to generalize applications of SMC to non-clock tree inference based on the existence of a flexible partially ordered set (poset) structure, and it is presented at a level of generality directly applicable to many other combinatorial spaces.
Markov chain Monte Carlo and its applications to phylogenetic tree construction
TLDR
This thesis forms a novel Bayesian model for phylogenetic tree construction based on recent studies that incorporates known information about the evolutionary history of the species, referred to as the species phylogeny, in a statistically rigorous way and develops an inference algorithm based on a Markov chain Monte Carlo method in order to overcome the computational complexity inherent in the problem.
Phylogenetic MCMC Algorithms Are Misleading on Mixtures of Trees
TLDR
It is proved that the Markov chains take exponentially many iterations to converge to the posterior distribution, which means that in cases of data containing potentially conflicting phylogenetic signals, phylogenetic reconstruction should be performed separately on each signal.
Guided tree topology proposals for Bayesian phylogenetic inference.
TLDR
This work investigates the performance of common MCMC proposal distributions in terms of median and variance of run time to convergence on 11 data sets, and introduces two new Metropolized Gibbs Samplers for moving through "tree space".
Parallel algorithms for Bayesian phylogenetic inference
TLDR
This paper describes parallel algorithms and their MPI-based parallel implementation for MCMC-based Bayesian phylogenetic inference and identifies a number of important points, including a superlinear speedup due to more effective cache usage and the point at which additional processors slow down the process due to communication overhead.
Bayesian selection of continuous-time Markov chain evolutionary models.
TLDR
A reversible jump Markov chain Monte Carlo approach to estimating the posterior distribution of phylogenies based on aligned DNA/RNA sequences under several hierarchical evolutionary models is developed, and it is found that the Kimura model is too restrictive and that the Hasegawa, Kishino, and Yano model can be rejected for some data sets.
Geometric ergodicity of a hybrid sampler for Bayesian inference of phylogenetic branch lengths.
TLDR
This work establishes geometric convergence for a particular Markov chain that is used to sample branch lengths under a fairly general class of nucleotide substitution models and provides a numerical method for estimating the time this Markov chain takes to converge.

References

SHOWING 1-10 OF 87 REFERENCES
Bayesian phylogenetic inference via Markov chain Monte Carlo methods.
We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a
Bayesian phylogenetic inference using DNA sequences: a Markov Chain Monte Carlo Method.
TLDR
An improved Bayesian method is presented for estimating phylogenetic trees using DNA sequence data, and the posterior probabilities of phylogenies are used to estimate the maximum posterior probability (MAP) tree, which has a probability of approximately 95%.
Markov Chain Monte Carlo Algorithms for the Bayesian Analysis of Phylogenetic Trees
We further develop the Bayesian framework for analyzing aligned nucleotide sequence data to reconstruct phylogenies, assess uncertainty in the reconstructions, and perform other statistical
Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. I. Four taxa with a molecular clock.
TLDR
Bootstrapping is a conservative approach for estimating the reliability of an inferred phylogeny for four taxa by using model trees of three taxa with an outgroup and by assuming a constant rate of nucleotide substitution.
A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data.
  • P. Lewis
  • Biology, Medicine
    Molecular biology and evolution
  • 1998
TLDR
The genetic algorithm described here required only 6% of the computational effort required by a conventional heuristic search using tree bisection/reconnection (TBR) branch swapping to obtain the same maximum-likelihood topology.
Stochastic search strategy for estimation of maximum likelihood phylogenetic trees.
TLDR
A stochastic search strategy for estimation of the ML tree that is based on a simulated annealing algorithm that is less likely to become trapped in local optima than are existing algorithms for ML tree estimation.
Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites.
  • Z. Yang
  • Biology, Medicine
    Molecular biology and evolution
  • 1993
TLDR
Felsenstein's maximum-likelihood approach for inferring phylogeny from DNA sequences is extended to the case where substitution rates over sites are described by the gamma distribution and a numerical example is presented to show that the method fits the data better than do previous models.
Inconsistency of evolutionary tree topology reconstruction methods when substitution rates vary across characters.
  • J. Chang
  • Mathematics, Medicine
    Mathematical biosciences
  • 1996
TLDR
Distance and maximum likelihood methods for topology estimation have been shown to be consistent under the homogeneity assumption; examples are given showing that these methods can fail to be consistent when the homogeneity assumption is relaxed.
Practical Markov Chain Monte Carlo
TLDR
The case is made for basing all inference on one long run of the Markov chain and estimating the Monte Carlo error by standard nonparametric methods well-known in the time-series and operations research literature.
Estimating effective population size and mutation rate from sequence data using Metropolis-Hastings sampling.
We present a new way to make a maximum likelihood estimate of the parameter 4Nμ (effective population size times mutation rate per site, or θ) based on a population sample of molecular