Bayesian sampling of genomic rearrangement scenarios via double cut and join

@article{Mikls2010BayesianSO,
  title={Bayesian sampling of genomic rearrangement scenarios via double cut and join},
  author={Istv{\'a}n Mikl{\'o}s and {\'E}ric Tannier},
  journal={Bioinformatics},
  year={2010},
  volume={26},
  number={24},
  pages={3012--3019}
}
MOTIVATION: When comparing the organization of two genomes, it is important not to draw conclusions on their modes of evolution from a single most parsimonious scenario explaining their differences. Better estimations can be obtained by sampling many different genomic rearrangement scenarios. For this problem, the Double Cut and Join (DCJ) model, while less relevant, is computationally easier than the Hannenhalli-Pevzner (HP) model. Indeed, in some special cases, the total number of DCJ sorting…

Sampling and counting genome rearrangement scenarios

TLDR
A Gibbs sampler for sampling most parsimonious labelings of evolutionary trees under the SCJ model is given, together with a mini-review of the state of the art in sampling and counting rearrangement scenarios, focusing on the reversal, DCJ and SCJ models.

Counting and sampling SCJ small parsimony solutions

Approximating the number of Double Cut-and-Join scenarios

Calibration of a probabilistic model of DNA evolution

TLDR
This thesis describes a model of evolution in which DNA is represented by genes and shows how optimal parameters for this model can be found; the main focus is the estimation of the number of chromosomal events.

Algorithms and methods for large-scale genome rearrangements identification

TLDR
This thesis by compendium addresses the formal definition of synteny blocks (SB) starting from High-scoring Segment Pairs (HSPs), and provides a solution to the granularity problem in SB detection: starting with small, well-conserved SB and, through rearrangement reconstruction, gradually increasing their length.

Variants of the Consecutive Ones Property: Algorithms, Computational Complexity and Applications to Genomics

TLDR
It is shown that the problem of optimizing over the set of repeat spanning intervals is NP-hard in general, and an algorithm is given for the case when the intervals are small.

Reconstructing the architecture of the ancestral amniote genome

TLDR
This work analyzes 15 vertebrate genomes, including 12 amniotes and 3 teleost fishes, infers a high-resolution organization of the ancestral amniote genome, composed of 39 ancestral linkage groups at a resolution of 100 kb, and shows that 36 out of the 39 have maximum support.

Linearization of ancestral multichromosomal genomes

TLDR
This work shows that, when restricted to binary matrices of degree two, which correspond to adjacencies, the genomic characters used in most ancestral genome reconstruction methods, this relaxed version of the Linearization Problem is polynomially solvable using a reduction to a matching problem.

Sampling solution traces for the problem of sorting permutations by signed reversals

TLDR
Qualitatively, the results show that, for testable-sized permutations, the algorithms DFALT and SWA produce distributions which approximate the reversal length distributions observed with a complete enumeration of the set of traces.

References

Efficient Sampling of Parsimonious Inversion Histories with Application to Genome Rearrangement in Yersinia

TLDR
It is found that on high-divergence data sets, MC4Inversion finds more optimal sorting paths per second than BADGER and the IS technique and simultaneously avoids bias inherent in the IS technique.

MCMC genome rearrangement

TLDR
A Markov chain Monte Carlo method for genome rearrangement, based on a stochastic model of evolution, which can estimate the number of different evolutionary events needed to sort a signed permutation.

Efficient sorting of genomic permutations by translocation, inversion and block interchange

TLDR
A universal double-cut-and-join operation that accounts for inversions, translocations, fissions and fusions, but also produces circular intermediates which can be reabsorbed, which converts one multi-linear chromosome genome to another in the minimum distance.

On Computing the Breakpoint Reuse Rate in Rearrangement Scenarios

TLDR
It is shown that the reuse rate is intimately linked to a particular rearrangement scenario, and that it can vary from 0.89 to 1.51 for scenarios of the same length that transform the mouse genome into the human genome, where a rate of 1 indicates no reuse at all.

Dynamics of Genome Rearrangement in Bacterial Populations

TLDR
These findings represent the first characterization of genome arrangement evolution in a bacterial population evolving outside laboratory conditions. Insight into the process of genomic rearrangement may further the understanding of pathogen population dynamics and selection on the architecture of circular bacterial chromosomes.

Combinatorial Structure of Genome Rearrangements Scenarios

TLDR
An exact formula is given for the number of double-cut-and-join (DCJ) rearrangement scenarios between two genomes, and effective bijections are constructed between the set of scenarios that sort a component and well-studied combinatorial objects such as parking functions, labeled trees, and Prüfer codes.

The Metropolized Partial Importance Sampling MCMC Mixes Slowly on Minimum Reversal Rearrangement Paths

TLDR
It is proved that the relaxation time of the Markov chains walking on the optimal reversal sorting scenarios might grow exponentially with the size of the signed permutations, namely, with the number of synteny blocks.

A Bayesian analysis of metazoan mitochondrial genome arrangements

TLDR
The purpose of this article is to quantify the uncertainty among the relationships of metazoan phyla on the basis of mitochondrial genome arrangements while incorporating prior knowledge of the monophyly of various groups from other sources.

Exploring the Solution Space of Sorting by Reversals, with Experiments and an Application to Evolution

TLDR
An algorithm which gives all the classes of solutions and counts the number of solutions in each class, with a better theoretical and practical complexity than the complete enumeration method is proposed.

Additions, Losses, and Rearrangements on the Evolutionary Route from a Reconstructed Ancestor to the Modern Saccharomyces cerevisiae Genome

TLDR
The principle of parsimony, applied to aligned synteny blocks from 11 yeast species, is used to infer the gene content and gene order that existed in the genome of an extinct ancestral yeast about 100 Mya, immediately before it underwent whole-genome duplication (WGD).