The prediction of regulatory elements is a problem where computational methods offer great hope. Over the past few years, numerous tools have become available for this task. The purpose of the current assessment is twofold: to provide some guidance to users regarding the accuracy of currently available tools in various settings, and to provide a benchmark …
Assessing the statistical significance of over-representation of exceptional words is becoming an important task in computational biology. We show on two problems how large deviation methodology applies. First, when some oligomer H occurs more often than expected, i.e. may be overrepresented, large deviations allow for a very efficient computation of the …
The prediction of viral zoonosis epidemics has become a major public health issue. A profound understanding of the viral population in key animal species acting as reservoirs represents an important step towards this goal. Bats harbor diverse viruses, some of which are of particular interest because they cause severe human diseases. However, little is known …
We study and compare two classes of statistical criteria to assess the significance of exceptional words. Indeed, the Z-score-like criteria, or the normal approximation that is a strict equivalent, suffer from several drawbacks in terms of sensitivity and specificity. Thanks to the combinatorial structure of words, a computation of the exact P-value has …
Various criteria have been defined to evaluate the significance of sets of words, but they are often difficult to compute. We provide explicit expressions for the waiting time in such a context. In order to assess the significance of a cluster of potential binding sites, we extend them to the co-occurrence problem. We point out that these criteria …
In mycobacteria, various type VII secretion systems, corresponding to different ESX (ESAT-6 secretory) types, contribute to pathogenicity, iron acquisition, and/or conjugation. In addition to the known chromosomal ESX loci, the existence of plasmid-encoded ESX systems was recently reported. To investigate the potential role of ESX-encoding plasmids on …
In mycobacteria, conjugation differs from the canonical Hfr model, but is still poorly understood. Here, we quantified this evolutionary process in a natural mycobacterial population, taking advantage of a large clinical strain collection of the emerging pathogen Mycobacterium abscessus (MAB). Multilocus sequence typing confirmed the existence of three M. …
Elizabethkingia anophelis is an emerging pathogen involved in human infections and outbreaks in distinct world regions. We investigated the phylogenetic relationships and pathogenesis-associated genomic features of two neonatal meningitis isolates obtained 5 years apart from one hospital in the Central African Republic and compared them with Elizabethkingia …
This note answers Steve Finch's question at the end of his webpage [1] on "Feller's Coin Tossing Constants". The question is how to adapt known techniques (the classical Bernoulli model) in order to study motifs in a random text with respect to a Markovian model. 1. Feller's Coin Tossing Constants. Bernoulli Model. We are interested in the probability …