Pavel A. Pevzner

The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform …
Genomes frequently evolve by reversals $\rho(i,j)$ that transform a gene order $\pi_1 \ldots \pi_i \pi_{i+1} \ldots \pi_{j-1} \pi_j \ldots$ …
Signal finding (pattern discovery in unaligned DNA sequences) is a fundamental problem in both computer science and molecular biology with important applications in locating regulatory sites and drug target identification. Despite many studies, this problem is far from being resolved: most signals in DNA sequences are so complicated that we don't yet have …
Sequence comparison in molecular biology is at the beginning of a major paradigm shift: a shift from gene comparison based on local mutations to chromosome comparison based on global rearrangements. In the simplest form, the problem of gene rearrangements corresponds to sorting by reversals, i.e., sorting of an array using reversals of arbitrary fragments. …
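As a rough illustration of what "sorting of an array using reversals of arbitrary fragments" means, the sketch below applies reversals to an unsigned permutation of 1..n. The function names `reversal` and `sort_by_reversals_greedy` are hypothetical, and the greedy strategy (place element k at position k with one reversal) is only a simple baseline chosen for clarity. It is not the method of the paper and does not, in general, find the minimum number of reversals.

```python
def reversal(perm, i, j):
    """Return a copy of perm with the fragment at positions i..j reversed.

    Positions are 1-based and inclusive, matching the usual rho(i, j) notation.
    """
    return perm[:i - 1] + perm[i - 1:j][::-1] + perm[j:]


def sort_by_reversals_greedy(perm):
    """Sort an unsigned permutation of 1..n with at most n - 1 reversals.

    Illustrative baseline only: at step k, one reversal moves element k
    into position k. The result is a valid, but not necessarily shortest,
    sequence of reversals.
    """
    perm = list(perm)
    steps = []
    for k in range(1, len(perm) + 1):
        pos = perm.index(k) + 1  # current 1-based position of element k
        if pos != k:
            perm = reversal(perm, k, pos)
            steps.append((k, pos))
    return perm, steps


if __name__ == "__main__":
    sorted_perm, steps = sort_by_reversals_greedy([3, 1, 2, 5, 4])
    print(sorted_perm, steps)
```

Each recorded step `(i, j)` is one reversal of the fragment between positions i and j, so the length of `steps` gives an upper bound on the reversal distance for that input.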
For the last 20 years, fragment assembly in DNA sequencing followed the "overlap-layout-consensus" paradigm that is used in all currently available assembly tools. Although this approach proved useful in assembling clones, it faces difficulties in genomic shotgun assembly. We abandon the classical "overlap-layout-consensus" approach in favor of a new euler …
Sequence comparison in computational molecular biology is a powerful tool for deriving evolutionary and functional relationships between genes. However, classical alignment algorithms handle only local mutations (i.e., insertions, deletions, and substitutions of nucleotides) and ignore global rearrangements (i.e., inversions and transpositions of long …
Recent progress in genome-scale sequencing and comparative mapping raises new challenges in studies of genome rearrangements. Although the pairwise genome rearrangement problem is well studied, algorithms for reconstructing rearrangement scenarios for multiple species are greatly needed. The previous approaches to the multiple genome rearrangement problem were …
Many people, including ourselves, believe that transformations of humans into mice happen only in fairy tales. However, despite some differences in appearance and habits, men and mice are genetically very similar. In the pioneering paper, Nadeau and Taylor estimated that surprisingly few genomic rearrangements happened since the divergence of human and mouse …
MOTIVATION: De novo repeat family identification is a challenging algorithmic problem of great practical importance. As the number of genome sequencing projects increases, there is a pressing need to identify the repeat families present in large, newly sequenced genomes. We develop a new method for de novo identification of repeat families via extension of …