We present a Markov chain Monte Carlo coalescent genealogy sampler, LAMARC 2.0, which estimates population genetic parameters from genetic data. LAMARC can co-estimate subpopulation Theta = 4N(e)mu, immigration rates, subpopulation exponential growth rates and overall recombination rate, or a user-specified subset of these parameters. It can…
We describe a method for co-estimating 4N(e)mu (four times the product of effective population size and neutral mutation rate) and population growth rate from sequence samples using Metropolis-Hastings sampling. Population growth (or decline) is assumed to be exponential. The estimates of growth rate are biased upwards, especially when 4N(e)mu is low; there is…
We present a new way to make a maximum likelihood estimate of the parameter 4N mu (effective population size times mutation rate per site, or theta) based on a population sample of molecular sequences. We use a Metropolis-Hastings Markov chain Monte Carlo method to sample genealogies in proportion to the product of their likelihood with respect to the data…
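As a generic illustration of the Metropolis-Hastings idea named in the abstract above, here is a minimal sketch that samples a single real value in proportion to an unnormalized target density. It is not the genealogy sampler described in the paper: the function names, the random-walk proposal, the step size, and the standard-normal target are all assumptions chosen only to keep the example small.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis-Hastings over a single real-valued state.

    log_target: log of an unnormalized target density (hypothetical stand-in
    for the quantity a real sampler would evaluate on each genealogy).
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, so the Hastings correction cancels.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p_new / p_old), compared in log space.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


def log_std_normal(x):
    # Log density of a standard normal, up to an additive constant.
    return -0.5 * x * x


samples = metropolis_hastings(log_std_normal, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Because states are visited in proportion to the target density, averages over the chain approximate expectations under that density; in this toy case the sample mean and variance should land near 0 and 1.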
We describe a method for co-estimating r = C/mu (where C is the per-site recombination rate and mu is the per-site neutral mutation rate) and Theta = 4N(e)mu (where N(e) is the effective population size) from a population sample of molecular data. The technique is Metropolis-Hastings sampling: we explore a large number of possible reconstructions of the…
Using simulated data, we compared five methods of phylogenetic tree estimation: parsimony, compatibility, maximum likelihood, Fitch-Margoliash, and neighbor joining. For each combination of substitution rates and sequence length, 100 data sets were generated for each of 50 trees, for a total of 5,000 replications per condition. Accuracy was measured by two…
We have developed a Bayesian version of our likelihood-based Markov chain Monte Carlo genealogy sampler LAMARC and compared the two versions for estimation of theta = 4N(e)mu, exponential growth rate, and recombination rate. We used simulated DNA data to assess accuracy of means and support or credibility intervals. In all cases the two methods had very…
Single nucleotide polymorphism (SNP) data can be used for parameter estimation via maximum likelihood methods as long as the way in which the SNPs were determined is known, so that an appropriate likelihood formula can be constructed. We present such likelihoods for several sampling methods. As a test of these approaches, we consider use of SNPs to estimate…
Coalescent genealogy samplers attempt to estimate past qualities of a population, such as its size, growth rate, patterns of gene flow or time of divergence from another population, based on samples of molecular data. Genealogy samplers are increasingly popular because of their potential to disentangle complex population histories. In the last decade they…
From 11 studies, a total of 1,792 Caucasian probands with insulin-dependent diabetes mellitus (IDDM) are analyzed. Antigen genotype frequencies in patients, transmission from affected parents to affected children, and the relative frequencies of HLA-DR3 and -DR4 homozygous patients all indicate that DR3 predisposes in a "recessive"-like and DR4 in a…
When population samples of molecular data, such as sequences, are taken, the members of the sample are related by a gene tree whose shape is affected by the population processes, such as genetic drift, change of population size, and migration. Genetic parameters such as recombination also affect that genealogy. Likelihood inference of these parameters…