Nikolay Vyahhi

The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform...
Summary: Limitations of genome sequencing techniques have led to dozens of assembly algorithms, none of which is perfect. A number of methods for comparing assemblers have been developed, but none is yet a recognized benchmark. Further, most existing methods for comparing assemblies are only applicable to new assemblies of finished genomes; the problem of...
The "dark matter of life" describes microbes and even entire divisions of bacterial phyla that have evaded cultivation and have yet to be sequenced. We present a genome from the globally distributed but elusive candidate phylum TM6 and uncover its metabolic potential. TM6 was detected in a biofilm from a sink drain within a hospital restroom by analyzing...
Comparing strains within the same microbial species has proven effective in the identification of genes and genomic regions responsible for virulence, as well as in the diagnosis and treatment of infectious diseases. In this paper, we present Sibelia, a tool for finding synteny blocks in multiple closely related microbial genomes using iterative de Bruijn...
Multiple target tracking (MTT) is a well-studied technique in the field of radar technology, which associates anonymized measurements with the appropriate object trajectories. This technique, however, suffers from combinatorial explosion, since each new measurement may potentially be associated with any of the existing tracks. Consequently, the complexity of...
We present C-Sibelia, a highly accurate and easy-to-use software tool for comparing two closely related bacterial genomes, which can be presented as either finished sequences or fragmented assemblies. C-Sibelia takes as input two FASTA files and produces: (1) a VCF file containing all identified single nucleotide variations and indels; (2) an XMFA file...
Next-generation sequencing (NGS) is increasingly being adopted as the backbone of biomedical research. With the commercialization of various affordable desktop sequencers, NGS will be reached by increasing numbers of cellular and molecular biologists, necessitating community consensus on bioinformatics protocols to tackle the exponential increase in...
The document ranking problem is a well-known problem that appears, for example, in search engines. Once a search engine has identified a set of potentially relevant documents, it must determine which of them are more relevant and which are less, that is, rank these documents by relevance. This is typically done by assigning a numerical...
A genome rearrangement scenario describes a series of chromosome fusion, fission, and translocation operations that suffice to rewrite one genome into another. Exact algorithmic methods for this important problem focus on providing one solution, while the set of distance-wise equivalent scenarios is very large. Moreover, no criteria for filtering for...