We present a Markov chain Monte Carlo coalescent genealogy sampler, LAMARC 2.0, which estimates population genetic parameters from genetic data. LAMARC can co-estimate subpopulation Θ = 4Nₑμ, immigration rates, subpopulation exponential growth rates and overall recombination rate, or a user-specified subset of these parameters. It can…
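As a small worked illustration of the parameter named above, the sketch below evaluates Θ = 4Nₑμ for a diploid population and inverts it to recover Nₑ when a mutation rate is assumed. The function names and the numeric values are hypothetical and chosen only for illustration; this is not LAMARC's code or interface.

```python
def theta_from_ne(effective_size: float, mutation_rate: float) -> float:
    """Return Theta = 4 * N_e * mu (diploid scaling, as in the abstract)."""
    return 4.0 * effective_size * mutation_rate


def ne_from_theta(theta: float, mutation_rate: float) -> float:
    """Invert Theta = 4 * N_e * mu to get N_e, given an assumed mu."""
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    return theta / (4.0 * mutation_rate)


# Hypothetical values: N_e = 10,000 and mu = 1e-8 per site per generation.
example_theta = theta_from_ne(10_000, 1e-8)
example_ne = ne_from_theta(example_theta, 1e-8)
```

Because Θ confounds Nₑ and μ, an estimate of Θ can only be converted to an effective population size when μ is supplied from another source.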
Coalescent genealogy samplers attempt to estimate past qualities of a population, such as its size, growth rate, patterns of gene flow or time of divergence from another population, based on samples of molecular data. Genealogy samplers are increasingly popular because of their potential to disentangle complex population histories. In the last decade they…
Cancer is considered an outcome of decades-long clonal evolution fueled by acquisition of somatic genomic abnormalities (SGAs). Non-steroidal anti-inflammatory drugs (NSAIDs) have been shown to reduce cancer risk, including risk of progression from Barrett's esophagus (BE) to esophageal adenocarcinoma (EA). However, the cancer chemopreventive mechanisms of…
We propose a genealogy-sampling algorithm, Sequential Markov Ancestral Recombination Tree (SMARTree), that provides an approach to estimation from SNP haplotype data of the patterns of coancestry across a genome segment among a set of homologous chromosomes. To enable analysis across longer segments of genome, the sequence of coalescent trees is modeled via…
Software which simulates, infers, or analyzes ancestral recombination graphs (ARGs) faces the problem of communicating them. Existing formats omit information either about the location of recombinations along the chromosome or the position of recombinations relative to the branching topology. We present a specialization of GraphML, an XML-based standard for…
Accurate phylogenies are critical to taxonomy as well as studies of speciation processes and other evolutionary patterns. Accurate branch lengths in phylogenies are critical for dating and rate measurements. Such accuracy may be jeopardized by unacknowledged sequencing error. We use simulated data to test a correction for DNA sequencing error in maximum…
There has been much interest in detecting genomic identity by descent (IBD) segments from modern dense genetic marker data and in using them to identify human disease susceptibility loci. Here we present a novel Bayesian framework using Markov chain Monte Carlo (MCMC) realizations to jointly infer IBD states among multiple individuals not known to be…
When multiple samples are taken from the neoplastic tissues of a single patient, it is natural to compare their mutation content. This is often done by bulk genotyping of whole biopsies, but the chance that a mutation will be detected in bulk genotyping depends on its local frequency in the sample. When the underlying mutation count per cell is equal,…
Ancestral recombination graphs (ARGs) represent the history of portions of a genome with recombination. Attempts to infer ARGs have been hampered by the lack of an ARG comparison metric which could be used to measure how well inference succeeded. We propose a simple ARG comparison framework based on averaging standard tree comparison measures across either…
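The idea of averaging a tree comparison measure along a genome segment can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' framework: the abstract is cut off before it says what the average is taken across, so averaging over every site of the segment is an assumption made here. Also assumed: each ARG is represented as a list of local trees over half-open site intervals, each tree is a set of rooted clades, and the tree measure is a Robinson-Foulds-style count of clades found in one tree but not the other. All names are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Clade = FrozenSet[str]
# A local tree: half-open site interval [start, end) plus its set of clades.
LocalTree = Tuple[int, int, Set[Clade]]


def clade_distance(a: Set[Clade], b: Set[Clade]) -> int:
    """Count clades present in exactly one of the two trees."""
    return len(a ^ b)


def tree_at(arg: List[LocalTree], site: int) -> Set[Clade]:
    """Return the clade set of the local tree covering the given site."""
    for start, end, clades in arg:
        if start <= site < end:
            return clades
    raise ValueError(f"site {site} is not covered by any local tree")


def mean_tree_distance(arg_a: List[LocalTree],
                       arg_b: List[LocalTree],
                       n_sites: int) -> float:
    """Average the per-site tree distance over sites 0 .. n_sites - 1."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    total = sum(
        clade_distance(tree_at(arg_a, s), tree_at(arg_b, s))
        for s in range(n_sites)
    )
    return total / n_sites


# Hypothetical three-tip example whose two ARGs differ only in where
# the recombination breakpoint falls (site 6 versus site 4).
AB, BC, ABC = frozenset("AB"), frozenset("BC"), frozenset("ABC")
true_arg = [(0, 6, {AB, ABC}), (6, 10, {BC, ABC})]
inferred_arg = [(0, 4, {AB, ABC}), (4, 10, {BC, ABC})]
score = mean_tree_distance(true_arg, inferred_arg, 10)
```

In this toy case the local trees disagree only at the two sites between the breakpoints, so the average penalizes a misplaced breakpoint in proportion to how far it is displaced.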