Inference of population history using coalescent HMMs: review and outlook.

@article{Spence2018InferenceOP,
  title={Inference of population history using coalescent HMMs: review and outlook.},
  author={Jeffrey P. Spence and Matthias Steinr{\"u}cken and Jonathan Terhorst and Yun S. Song},
  journal={Current Opinion in Genetics \& Development},
  year={2018},
  volume={53},
  pages={70--76}
}
Studying how diverse human populations are related is of historical and anthropological interest, in addition to providing a realistic null model for testing for signatures of natural selection or disease associations. Furthermore, understanding the demographic histories of other species is playing an increasingly important role in conservation genetics. A number of statistical methods have been developed to infer population demographic histories using whole-genome sequence data, with recent… 
Inferring Population Size Histories using Coalescent Hidden Markov Models with TMRCA and Total Branch Length as Hidden States
TLDR
CHIMP (CHMM History-Inference ML Procedure), a novel CHMM method for inferring the size history of a population, is presented. The method is agnostic to the phasing of the data, which makes it a promising alternative in scenarios where high-quality data are not available, and it has potential applications for pseudo-haploid data.
A practical introduction to sequentially Markovian coalescent methods for estimating demographic history from genomic data
TLDR
This practical review explains some of the key concepts underpinning the pairwise and multiple sequentially Markovian coalescent methods (PSMC and MSMC), and explains how the choice of different parameter values by the user can affect the accuracy and precision of the inferences.
Limits and convergence properties of the sequentially Markovian coalescent
TLDR
A tool is presented that infers the best-case convergence of SMC methods, assuming the true underlying coalescent genealogies are known, along with a new interpretation of SMC methods that highlights the importance of the transition matrix, which, it is argued, can be used as a set of summary statistics in other statistical inference methods.
Inference of complex population histories using whole-genome sequences from multiple populations
TLDR
DiCal2, an efficient, flexible statistical method that can use whole-genome sequence data from multiple populations to infer complex demographic models involving population size changes, population splits, admixture, and migration, is presented.
Limits and Convergence properties of the Sequentially Markovian Coalescent
TLDR
The limits of this methodology are explored, and a tool is presented that helps users quantify what information can be confidently retrieved from given datasets and study how violating the hypotheses and assumptions of SMC approaches affects inference accuracy.
Fine-Scale Inference of Ancestry Segments Without Prior Knowledge of Admixing Groups
TLDR
An algorithm is presented for inferring ancestry segments and characterizing admixture events that involve an arbitrary number of genetically differentiated groups coming together. It allows inference of the demographic history of the species and the properties of admixing groups, identification of signatures of natural selection, and may aid disease gene mapping.
Demographic inference
TLDR
The next generation sequencing revolution has multiplied the amount of genetic data for many organisms by orders of magnitude, and new genomic data can be used to infer the past demographic history of populations.
Exact decoding of the sequentially Markov coalescent
TLDR
This work derives fast, exact methods for frequentist and Bayesian inference using SMC, which require minimal user intervention or parameter tuning and no numerical optimization or E-M, and which are faster and more accurate.
EXACT DECODING OF THE SEQUENTIALLY MARKOV COALESCENT
In statistical genetics, the sequentially Markov coalescent (SMC) is an important framework for approximating the distribution of genetic variation data under complex evolutionary models. Methods
The effect of undetected recombination on genealogy sampling and inference under an isolation‐with‐migration model
TLDR
It was found that the details of sampling intervals that pass a four‐gamete filter have a moderate effect, and that schemes that use the longest intervals, or that use overlapping intervals, gave poorer results, suggesting that filtering based on the four-gamete criterion, while necessary for methods like these, leads to reduced resolution on migration.

References

SHOWING 1-10 OF 94 REFERENCES
Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data
TLDR
Combining the demographic model with a previously estimated distribution of selective effects among newly arising amino acid mutations accurately predicts the frequency spectrum of nonsynonymous variants across three continental populations (YRI, CHB, CEU).
Exact limits of inference in coalescent models
TLDR
This work considers the problem of recovering the true population size history from two possible alternatives on the basis of coalescent time data and improves upon previous results by giving exact expressions for the probability of correctly distinguishing between the two hypotheses as a function of the separation between the alternative size histories.
Estimating Variable Effective Population Sizes from Multiple Genomes: A Sequentially Markov Conditional Sampling Distribution Approach
TLDR
This work generalizes the recently developed sequentially Markov conditional sampling distribution framework, which provides an accurate approximation of the probability of observing a newly sampled haplotype given a set of previously sampled haplotypes.
Comparison of Single Genome and Allele Frequency Data Reveals Discordant Demographic Histories
TLDR
It is found that several of the demographic histories inferred by the whole genome-based methods do not predict the genome-wide distribution of heterozygosity, nor do they predict the empirical SFS, which indicates that demographic inference from a small number of genomes should be interpreted cautiously.
Inference of complex population histories using whole-genome sequences from multiple populations
TLDR
DiCal2, an efficient, flexible statistical method that can utilize whole-genome sequence data from multiple populations to infer complex demographic models involving population size changes, population splits, admixture, and migration, finds that the population ancestral to Australians and Papuans started separating from East Asians and Europeans about 100,000 years ago.
Effects of Linked Selective Sweeps on Demographic Inference and Model Selection
TLDR
It is argued that natural populations may experience the amount of recent positive selection required to skew inferences, and results suggest that demographic studies conducted in many species to date may have exaggerated the extent and frequency of population size changes.
Inference of historical migration rates via haplotype sharing
TLDR
Pairs of individuals from a study cohort will often share long-range haplotypes identical-by-descent, and they can now be efficiently detected in high-resolution genomic datasets, providing a novel source of information in several domains of genetic analysis.
Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data.
TLDR
This work develops a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies, which is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum.
Inferring the Joint Demographic History of Multiple Populations: Beyond the Diffusion Approximation
TLDR
A tractable model of ordinary differential equations for the evolution of allele frequencies that is closely related to the diffusion approximation but avoids many of its limitations and approximations is proposed.
Inferring Demographic History from a Spectrum of Shared Haplotype Lengths
TLDR
The data show evidence of extensive gene flow between Africa and Europe after the time of divergence, as well as substructure and gene flow among ancestral hominids. Recent African-European gene flow and ancient ghost admixture into Europe are both inferred to be necessary to explain the spectrum of IBS sharing in the trios, rejecting simpler models that contain less population structure.