• Corpus ID: 220661200

Statistical properties of the site-frequency spectrum associated with Lambda-coalescents

Matthias C. F. Birkner, Jochen Blath, Bjarki Eldon
arXiv: Populations and Evolution

We study statistical properties of the site-frequency spectrum associated with Lambda-coalescents. In particular, we derive recursions for the expected value, variance, and covariance of the spectrum, extending earlier results of Fu (1995) for the classical Kingman coalescent. Our focus is on estimating the coalescent parameters introduced by certain Lambda-coalescents for datasets too large for full-likelihood methods. The recursions for the expected values we obtain can be used to…
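
For orientation, the Kingman baseline that the abstract says is being extended has a well-known closed form: under the infinitely-many-sites model with scaled mutation rate θ, the expected number of sites at which the derived allele appears i times in a sample of size n is E[ξ_i] = θ/i. The sketch below computes only that baseline. It is not the Lambda-coalescent recursion of the paper, and the function name `kingman_expected_sfs` and the value chosen for `theta` are illustrative assumptions.

```python
from fractions import Fraction


def kingman_expected_sfs(n, theta):
    """Expected unfolded site-frequency spectrum under the Kingman coalescent.

    Returns [E[xi_1], ..., E[xi_{n-1}]] with E[xi_i] = theta / i.
    `theta` is the scaled mutation rate (hypothetical value supplied by the caller).
    """
    if n < 2:
        raise ValueError("sample size n must be at least 2")
    return [theta / i for i in range(1, n)]


# Example with exact rational arithmetic: n = 5, theta = 2.
sfs = kingman_expected_sfs(5, Fraction(2))

# The entries sum to theta * (1 + 1/2 + ... + 1/(n-1)),
# the expected number of segregating sites.
expected_segregating_sites = sum(sfs)
```

Comparing an observed spectrum against this 1/i shape is the usual starting point before fitting a multiple-merger model.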

Can the Site-Frequency Spectrum Distinguish Exponential Population Growth from Multiple-Merger Coalescents?

Estimates of statistical power indicate that exponential and algebraic growth can indeed be distinguished from multiple-merger coalescents, even for moderate sample sizes, if the number of segregating sites is high enough.

Multi-locus data distinguishes between population growth and multiple merger coalescents

  • Jere Koskela
  • Computer Science
  • Statistical Applications in Genetics and Molecular Biology
  • 2018
We introduce a low dimensional function of the site frequency spectrum that is tailor-made for distinguishing coalescent models with multiple mergers from Kingman coalescent models with

Distinguishing multiple-merger from Kingman coalescence using two-site frequency spectra

A new method is presented based on the pointwise mutual information of the two-site frequency spectrum for pairs of linked sites that can detect when the genome-wide genetic diversity is inconsistent with the Kingman coalescent, rather than detecting outlier regions, as in selection scan methods.

The multifurcating skyline plot

This work applies the multifurcating skyline plot to a molecular clock phylogeny of 1,610 Ebola virus sequences from the 2014-2016 West African outbreak and shows that variance in the reproductive success of the pathogen through time can be estimated by combining the skyline plot with epidemiological case count data.

The expected neutral frequency spectrum of two linked sites

An exact, closed expression is presented for the expected neutral Site Frequency Spectrum for two neutral sites, 2-SFS, without recombination, which can be used to improve neutrality tests, composite likelihood and Poisson random field methods.

Coalescence 2.0: a multiple branching of recent theoretical developments and their applications

The relevance of multiple-merger models for detecting SNPs under selection in these species and for population genomics with very large sample sizes is discussed, and a re-examination of the conclusions of previous population genetics studies is advocated.

Genealogies and inference for populations with highly skewed offspring distributions

Inference methods under the infinitely-many sites model which allow both model selection and estimation of model parameters under these coalescents are discussed.

Genealogical structure changes as range expansions transition from pushed to pulled

The results show that range expansions provide a robust mechanism for generating different types of multiple mergers, which could be similar to those observed in populations with strong selection or high fecundity. The work also makes precise predictions about the effects of population dynamics on genetic diversity at the expansion front, which are confirmed in simulations.


We quantify the behaviour at large scales of the beta coalescent Π = {Π(t), t ≥ 0} with parameters a, b > 0. Specifically, we study the rescaled block size spectrum of Π(t) and of its restriction



On Asymptotics of the Beta Coalescents

We show that the total number of collisions in the exchangeable coalescent process driven by the beta (1, b) measure converges in distribution to a 1-stable law, as the initial number of particles

Coalescent processes derived from some compound Poisson population models

A particular subclass of compound Poisson population models is analyzed. The models in the domain of attraction of the Kingman coalescent are characterized and it is shown that these models are never

Patterns of Neutral Diversity Under General Models of Selective Sweeps

A general model of recurrent selective sweeps in a coalescent framework is developed, one that generalizes the recurrent full-sweep model to the case where selected alleles do not sweep to fixation and shows that in a large population, only the initial rapid increase of a selected allele affects the genealogy at partially linked sites.

The asymptotic distribution of the length of Beta-coalescent trees

We derive the asymptotic distribution of the total length $L_n$ of a $\operatorname {Beta}(2-\alpha,\alpha)$-coalescent tree for $1<\alpha<2$, starting from $n$ individuals. There are two regimes: If

Population growth of human Y chromosomes: a study of Y chromosome microsatellites.

The finding of a recent common ancestor (probably in the last 120,000 years), coupled with a strong signal of demographic expansion in all populations, suggests either a recent human expansion from a small ancestral population, or natural selection acting on the Y chromosome.

Coalescents with Simultaneous Multiple Collisions

We study a family of coalescent processes that undergo "simultaneous multiple collisions," meaning that many clusters of particles can merge into a single cluster at one time, and many such mergers

Testing for Neutrality in Samples With Sequencing Errors

It is shown that even with a moderate number of sequencing errors, neutrality tests based on the frequency spectrum reject neutrality, which implies that analyses of data sets with such errors will systematically lead to wrong inferences of evolutionary scenarios.

The Total External Branch Length of Beta-Coalescents

It turns out that the fluctuations of the external branch length follow those of $\tau_n^{2-\alpha}$ over the entire parameter regime, where $\tau_n$ denotes the random number of coalescences that bring the $n$ lineages down to one.
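
As a rough illustration of the quantity τ_n, the sketch below simulates the number of coalescence events needed to bring n lineages down to one. It assumes the Beta(2−α, α) parametrisation that appears in the tree-length entry above, for which the rate at which a given group of k out of b lineages merges is B(k−α, b−k+α)/B(2−α, α). The function names are hypothetical, and the code is only intended for small n, since the binomial weights grow quickly.

```python
import math
import random


def beta_merger_rate(b, k, alpha):
    """Rate at which one given k-tuple out of b lineages merges.

    Assumes a Beta(2 - alpha, alpha) coalescent with 0 < alpha < 2:
    lambda_{b,k} = B(k - alpha, b - k + alpha) / B(2 - alpha, alpha).
    """
    if not 0 < alpha < 2:
        raise ValueError("alpha must lie in (0, 2)")
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")

    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

    return math.exp(log_beta(k - alpha, b - k + alpha) - log_beta(2 - alpha, alpha))


def sample_num_coalescences(n, alpha, rng):
    """Simulate the jump chain of the block-counting process; return tau_n."""
    b, tau = n, 0
    while b > 1:
        sizes = list(range(2, b + 1))
        # Total rate of any k-merger is C(b, k) * lambda_{b,k}.
        weights = [math.comb(b, k) * beta_merger_rate(b, k, alpha) for k in sizes]
        k = rng.choices(sizes, weights=weights)[0]
        b -= k - 1  # k lineages merge into one
        tau += 1
    return tau


rng = random.Random(0)
tau = sample_num_coalescences(20, 1.5, rng)
```

Under the Kingman coalescent every event merges exactly two lineages, so τ_n = n − 1; with multiple mergers τ_n is random and at most n − 1.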