The SMC′ Is a Highly Accurate Approximation to the Ancestral Recombination Graph

@article{Wilton2015TheSI,
  title={The SMC′ Is a Highly Accurate Approximation to the Ancestral Recombination Graph},
  author={Peter R Wilton and Shai Carmi and Asger Hobolth},
  journal={Genetics},
  year={2015},
  volume={200},
  pages={343--355}
}
Two sequentially Markov coalescent models (SMC and SMC′) are available as tractable approximations to the ancestral recombination graph (ARG). We present a Markov process describing coalescence at two fixed points along a pair of sequences evolving under the SMC′. Using our Markov process, we derive a number of new quantities related to the pairwise SMC′, thereby analytically quantifying for the first time the similarity between the SMC′ and the ARG. We use our process to show that the joint… 

The distribution of waiting distances in ancestral recombination graphs.
TLDR
An analytic expression is derived for the distribution of waiting distances between tree changes under the sequentially Markovian coalescent model, and an accurate approximation to the distribution of waiting distances for topology changes is obtained.
The distribution of waiting distances in ancestral recombination graphs and its applications
TLDR
An analytic expression is derived for the distribution of waiting distances between tree changes under the sequentially Markovian coalescent model, and an accurate approximation to the distribution of waiting distances for topology changes is obtained.
Bayesian Nonparametric Inference of Population Size Changes from Sequential Genealogies
TLDR
A Gaussian process-based Bayesian nonparametric method coupled with a sequentially Markov coalescent model that allows accurate inference of population sizes over time from a set of genealogies and outperforms recent likelihood-based methods that rely on discretization of the parameter space.
Inferring Population Genetic Parameters: Particle Filtering, HMM, Ripley's K-Function or Runs of Homozygosity?
TLDR
A new tool is described, RECJumper, for inference in the Markov approximated coalescent model, finding that choosing an appropriate proposal distribution is crucial to obtain satisfactory behaviour in particle filtering, and tree space discretisation in HMM-methodology is non-trivial and the choice can influence the results.
Inference of recombination maps from a single pair of genomes and its application to archaic samples
TLDR
iSMC accurately infers recombination maps under a wide range of scenarios – remarkably, even from a single pair of unphased genomes, and reports that the evolution of the recombination landscape follows the established phylogeny of Neandertals, Denisovans and modern human populations.
Inference of recombination maps from a single pair of genomes and its application to ancient samples
TLDR
The sequentially Markovian coalescent model is extended to jointly infer demography and the spatial variation in recombination rate, and it is demonstrated that iSMC accurately infers recombination maps under a wide range of scenarios, remarkably even from a single pair of unphased genomes.
Frontiers in Coalescent Theory: Pedigrees, Identity-by-Descent, and Sequentially Markov Coalescent Models
TLDR
This dissertation develops a coalescent hidden Markov model approach to inferring the demographic and reproductive history of a triploid asexual lineage derived from a diploid sexual ancestor of the New Zealand snail Potamopyrgus antipodarum.
High-throughput inference of pairwise coalescence times identifies signals of selection and enriched disease heritability
TLDR
A new method, ASMC, that can estimate coalescence times using only SNP array data, and is 2-4 orders of magnitude faster than previous methods when sequencing data are available, was developed and applied to sequencing data from 498 Dutch individuals to detect background selection at deeper time scales.
Robust and scalable inference of population history from hundreds of unphased whole genomes
TLDR
SMC++ is presented, a new statistical tool capable of analyzing orders of magnitude more samples than existing methods while requiring only unphased genomes and employing a novel spline regularization scheme that greatly reduces estimation error.
A non-zero variance of Tajima’s estimator for two sequences even for infinitely many unlinked loci
TLDR
The population-scaled mutation rate, θ, is informative on the effective population size and is thus widely used in population genetics, but Tajima’s estimator, which is the average number of pairwise differences, is not consistent and therefore its variance does not vanish even as n → ∞.
