Bayesian sampling of genomic rearrangement scenarios via double cut and join

Authors: István Miklós and Éric Tannier
Volume: 26 (24)
MOTIVATION: When comparing the organization of two genomes, it is important not to draw conclusions on their modes of evolution from a single most parsimonious scenario explaining their differences. Better estimations can be obtained by sampling many different genomic rearrangement scenarios. For this problem, the Double Cut and Join (DCJ) model, while less relevant, is computationally easier than the Hannenhalli-Pevzner (HP) model. Indeed, in some special cases, the total number of DCJ sorting…


Sampling and counting genome rearrangement scenarios

This work gives a Gibbs sampler for sampling most parsimonious labelings of evolutionary trees under the SCJ model, together with a mini-review of the state of the art in sampling and counting rearrangement scenarios, focusing on the reversal, DCJ and SCJ models.

Counting and sampling SCJ small parsimony solutions

Approximating the number of Double Cut-and-Join scenarios

Calibration of a probabilistic model of DNA evolution

This thesis describes a model of evolution in which DNA is represented by genes and shows how optimal parameters for this model can be found; the main focus is on estimating the number of chromosomal events.

Algorithms and methods for large-scale genome rearrangements identification

This thesis by compendium addresses the formal definition of synteny blocks (SB) starting from High-scoring Segment Pairs (HSPs) and provides a solution to the granularity problem in SB detection: start with small, well-conserved SB and gradually increase their length through rearrangement reconstruction.

Variants of the Consecutive Ones Property: Algorithms, Computational Complexity and Applications to Genomics

It is shown that the problem of optimizing over the set of repeat spanning intervals is NP-hard in general, and an algorithm is given for the case where the intervals are small.

Reconstructing the architecture of the ancestral amniote genome

This work analyzes 15 vertebrate genomes, including 12 amniotes and 3 teleost fishes, and infers a high-resolution organization of the ancestral amniote genome, composed of 39 ancestral linkage groups at a resolution of 100 kb; 36 of the 39 have maximum support.

Linearization of ancestral multichromosomal genomes

This work shows that, when restricted to binary matrices of degree two, which correspond to adjacencies, the genomic characters used in most ancestral genome reconstruction methods, this relaxed version of the Linearization Problem is polynomially solvable using a reduction to a matching problem.

The Inference of Gene Trees with Species Trees

Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction, and these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences.



Efficient Sampling of Parsimonious Inversion Histories with Application to Genome Rearrangement in Yersinia

It is found that on high-divergence data sets, MC4Inversion finds more optimal sorting paths per second than BADGER and the IS technique, and simultaneously avoids the bias inherent in the IS technique.

MCMC genome rearrangement

A Markov Chain Monte Carlo method for genome rearrangement, based on a stochastic model of evolution, that can estimate the number of different evolutionary events needed to sort a signed permutation.

Efficient sorting of genomic permutations by translocation, inversion and block interchange

A universal double-cut-and-join operation that accounts for inversions, translocations, fissions and fusions, and that also produces circular intermediates which can be reabsorbed; it converts one multi-linear chromosome genome to another in the minimum distance.
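
To make the operation concrete, here is a minimal Python sketch of a single DCJ step. It assumes a common textbook encoding that is not taken from this page: a genome is a set of adjacencies, each an unordered pair of gene extremities such as "1h" (head of gene 1) and "2t" (tail of gene 2). The function name `dcj` and the `swap` flag are hypothetical, and telomeres (linear chromosome ends) are left out for brevity.

```python
from typing import FrozenSet, Set

# Assumed encoding: an adjacency is an unordered pair of gene extremities.
Adjacency = FrozenSet[str]


def dcj(genome: Set[Adjacency], p: Adjacency, q: Adjacency,
        swap: bool = False) -> Set[Adjacency]:
    """Cut adjacencies p = {a, b} and q = {c, d}, then rejoin the four ends.

    With swap=False the result contains {a, c} and {b, d};
    with swap=True it contains {a, d} and {b, c}.
    Extremities are ordered alphabetically only to make the choice reproducible.
    """
    if p == q or p not in genome or q not in genome:
        raise ValueError("p and q must be two distinct adjacencies of the genome")
    a, b = sorted(p)
    c, d = sorted(q)
    if swap:
        joined = {frozenset((a, d)), frozenset((b, c))}
    else:
        joined = {frozenset((a, c)), frozenset((b, d))}
    return (genome - {p, q}) | joined


# A circular chromosome with genes 1, 2, 3 in that order.
genome = {
    frozenset(("1h", "2t")),
    frozenset(("2h", "3t")),
    frozenset(("3h", "1t")),
}
left, right = frozenset(("1h", "2t")), frozenset(("2h", "3t"))

inverted = dcj(genome, left, right)            # gene 2 is reversed in place
excised = dcj(genome, left, right, swap=True)  # gene 2 closes into its own circle
```

In this toy genome the first call corresponds to an inversion of gene 2, while the second cuts gene 2 out as a separate circular chromosome, the kind of circular intermediate the summary above mentions.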

Counting All DCJ Sorting Scenarios

This work studies the solution space of the DCJ operation and gives an easy-to-compute formula for the exact number of optimal DCJ sorting sequences for a particular subset of instances of the problem.

On Computing the Breakpoint Reuse Rate in Rearrangement Scenarios

It is shown that the reuse rate is intimately linked to a particular rearrangement scenario, and that it can vary from 0.89 to 1.51 for scenarios of the same length that transform the mouse genome into the human genome, where a rate of 1 indicates no reuse at all.

Dynamics of Genome Rearrangement in Bacterial Populations

These findings represent the first characterization of genome arrangement evolution in a bacterial population evolving outside laboratory conditions. Insight into the process of genomic rearrangement may further the understanding of pathogen population dynamics and selection on the architecture of circular bacterial chromosomes.

The Metropolized Partial Importance Sampling MCMC Mixes Slowly on Minimum Reversal Rearrangement Paths

It is proved that the relaxation time of the Markov chains walking on the optimal reversal sorting scenarios might grow exponentially with the size of the signed permutations, namely, with the number of synteny blocks.

A Bayesian analysis of metazoan mitochondrial genome arrangements

The purpose of this article is to quantify the uncertainty among the relationships of metazoan phyla on the basis of mitochondrial genome arrangements while incorporating prior knowledge of the monophyly of various groups from other sources.

Additions, Losses, and Rearrangements on the Evolutionary Route from a Reconstructed Ancestor to the Modern Saccharomyces cerevisiae Genome

The principle of parsimony, applied to aligned synteny blocks from 11 yeast species, is used to infer the gene content and gene order that existed in the genome of an extinct ancestral yeast about 100 Mya, immediately before it underwent whole-genome duplication (WGD).

A Methodological Framework for the Reconstruction of Contiguous Regions of Ancestral Genomes and Its Application to Mammalian Genomes

A general methodological framework for reconstructing ancestral genome segments from conserved syntenies in extant genomes is described and developed into a new reconstruction method considering conserved gene clusters with similar gene content, mimicking principles used in most cytogenetic studies, although on a different kind of data.