Can the Site-Frequency Spectrum Distinguish Exponential Population Growth from Multiple-Merger Coalescents?

  • Bjarki Eldon, Matthias C. F. Birkner, Jochen Blath, Fabian Freund
  • pages 841-856
The ability of the site-frequency spectrum (SFS) to reflect the particularities of gene genealogies exhibiting multiple mergers of ancestral lines as opposed to those obtained in the presence of population growth is our focus. An excess of singletons is a well-known characteristic of both population growth and multiple mergers. Other aspects of the SFS, in particular, the weight of the right tail, are, however, affected in specific ways by the two model classes. Using an approximate likelihood… 
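The "excess of singletons" mentioned above is measured against a baseline. As a minimal sketch, assuming the standard constant-size Kingman coalescent as that baseline (where the expected number of sites with i derived copies in a sample of size n is θ/i), the following computes the baseline spectrum and the ratio of an observed singleton fraction to it. The function names and the example counts are hypothetical illustrations, not the authors' method, and the approximate-likelihood procedure from the abstract is not reproduced here.

```python
def kingman_expected_sfs(n, theta=1.0):
    """Expected unfolded SFS under the constant-size Kingman coalescent.

    Entry i-1 holds E[xi_i] = theta / i for i = 1, ..., n-1.
    """
    return [theta / i for i in range(1, n)]


def normalised(sfs):
    """Scale a spectrum so that its entries sum to one."""
    total = sum(sfs)
    return [x / total for x in sfs]


def singleton_excess(observed_sfs, n):
    """Observed singleton fraction divided by the Kingman baseline fraction.

    Values above 1 indicate more singletons than the baseline predicts.
    This alone does not say whether growth or multiple mergers is the cause.
    """
    baseline = normalised(kingman_expected_sfs(n))[0]
    return normalised(observed_sfs)[0] / baseline


# Hypothetical counts for a sample of size n = 5 (classes 1..4).
example_counts = [10, 1, 1, 1]
ratio = singleton_excess(example_counts, n=5)
```

A ratio above 1 is expected under both model classes discussed in the abstract, which is why the abstract points to other features of the SFS, such as the weight of the right tail, for telling them apart.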
Distinguishing multiple-merger from Kingman coalescence using two-site frequency spectra
A new method is presented based on the pointwise mutual information of the two-site frequency spectrum for pairs of linked sites that can detect when the genome-wide genetic diversity is inconsistent with the Kingman coalescent, rather than detecting outlier regions, as in selection scan methods.
Multi-locus data distinguishes between population growth and multiple merger coalescents
  • Jere Koskela
  • Computer Science
    Statistical applications in genetics and molecular biology
  • 2018
We introduce a low-dimensional function of the site frequency spectrum that is tailor-made for distinguishing coalescent models with multiple mergers from Kingman coalescent models with
Non-parametric estimation of population size changes from the site frequency spectrum
This work presents a new method, CubSFS, for estimating the changes in population size of a panmictic population from an observed SFS, and provides a straightforward proof for the expression of the expected site frequency spectrum depending only on the population size.
The Site Frequency Spectrum for General Coalescents
This work derives a new formula for the expected SFS for general Λ- and Ξ-coalescents, which leads to an efficient algorithm, and obtains general theoretical results for the identifiability of the Λ measure when ζ is a constant function, as well as for the identifiability of the function ζ under a fixed Ξ measure.
The site-frequency spectrum associated with Ξ-coalescents.
Recursions are given for the expected site-frequency spectrum associated with so-called Xi-coalescents, that is, exchangeable coalescents which admit simultaneous multiple mergers of ancestral lineages. It is suggested that for autosomal population genetic data from diploid or polyploid highly fecund populations that may have skewed offspring distributions, one should apply Xi-coalescents rather than Lambda-coalescents.
Inference of Super-exponential Human Population Growth via Efficient Computation of the Site Frequency Spectrum for Generalized Models
Applying the inference framework to data from the NHLBI Exome Sequencing Project, it is found that a model with a generalized growth epoch fits the observed SFS significantly better than the equivalent model with exponential growth (P-value = 3.85×10⁻⁶).
External branch lengths of Λ-coalescents without a dust component
Λ-coalescents model genealogies of samples of individuals from a large population by means of a family tree. The tree’s leaves represent the individuals, and the lengths of the adjacent edges
Coalescent Processes with Skewed Offspring Distributions and Nonequilibrium Demography
An extended Moran model with exponential population growth is developed, and it is demonstrated that the underlying ancestral process converges to a time-inhomogeneous psi-coalescent, and both can be estimated accurately from whole-genome data.


Statistical properties of the site-frequency spectrum associated with Lambda-coalescents
Statistical properties of the site frequency spectrum associated with Lambda-coalescents are our objects of study. In particular, we derive recursions for the expected value, variance, and covariance
Computing likelihoods for coalescents with multiple collisions in the infinitely many sites model
It is argued that within the (vast) family of Λ-coalescents, the parametrisable sub-family of Beta(2 − α, α)-coalescents, where α ∈ (1, 2], is of particular relevance, and a method is obtained to compute (approximate) likelihood surfaces for the observed type probabilities of a given sample.
Genealogies of rapidly adapting populations
It is argued that lineages trace back to a small pool of highly fit ancestors, in which almost simultaneous coalescence of more than two lineages frequently occurs, and that the resulting multiple-merger genealogies should be considered as a null model for adapting populations.
Coalescent Processes When the Distribution of Offspring Number Among Individuals Is Highly Skewed
A complex set of scaling relationships between mutation and reproduction in a simple model of a population suggests the presence of rare reproduction events in which ∼8% of the population is replaced by the offspring of a single individual.
It is proved that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large, and a general bound on the sample size sufficient for identifiability is obtained.
Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data.
This work develops a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies, which is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum.
The general coalescent with asynchronous mergers of ancestral lines
  • S. Sagitov
  • Mathematics
    Journal of Applied Probability
  • 1999
Take a sample of individuals in the fixed-size population model with exchangeable family sizes. Follow the ancestral lines for the sampled individuals backwards in time to observe the ancestral
Maximum likelihood estimation of population growth rates based on the coalescent.
This paper describes a method for co-estimating 4N_eμ (four times the product of effective population size and neutral mutation rate) and population growth rate from sequence samples using Metropolis-Hastings sampling, and suggests that sampling additional unlinked loci is much more effective in reducing the bias than increasing the number or length of sequences from the same locus.
Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data
This work develops a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation.
Population growth makes waves in the distribution of pairwise genetic differences.
Episodes of population growth and decline leave characteristic signatures in the distribution of nucleotide (or restriction) site differences between pairs of individuals. These signatures appear in