The prediction of regulatory elements is a problem where computational methods offer great hope. Over the past few years, numerous tools have become available for this task. The purpose of the current assessment is twofold: to provide some guidance to users regarding the accuracy of currently available tools in various settings, and to provide a benchmark ...
Following the recent independent proofs of Immerman [SIAM J. ...] and Szelepcsényi that nondeterministic space-bounded complexity classes are closed under complementation, two further applications of the inductive counting technique are developed. First, an errorless probabilistic algorithm for the undirected graph s-t connectivity problem that runs in O(log n) space and polynomial ...
Understanding the mechanisms that determine the regulation of gene expression is an important and challenging problem. A fundamental subproblem is to identify DNA-binding sites for unknown regulatory factors, given a collection of genes believed to be coregulated, and given the noncoding DNA sequences near those genes. We present an enumerative statistical ...
Phylogenetic footprinting is a method for the discovery of regulatory elements in a set of homologous regulatory regions, usually collected from multiple species. It does so by identifying the best conserved motifs in those homologous regions. This note describes web software that has been designed specifically for this purpose, making use of the ...
  • Heui-Dong Park, Kristi M Guinn, Maria I Harrell, Reiling Liao, Martin I Voskuil, Martin Tompa +2 others
  • 2003
Unlike many pathogens that are overtly harmful to their hosts, Mycobacterium tuberculosis can persist for years within humans in a clinically latent state. Latency is often linked to hypoxic conditions within the host. Among M. tuberculosis genes induced by hypoxia is a putative transcription factor, Rv3133c/DosR. We performed targeted disruption of this ...
BACKGROUND: This paper addresses the problem of discovering transcription factor binding sites in heterogeneous sequence data, which includes regulatory sequences of one or more genes, as well as their orthologs in other species. RESULTS: We propose an algorithm that integrates two important aspects of a motif's significance - overrepresentation and ...
Given a sequence of real numbers ("scores"), we present a practical linear time algorithm to find those nonoverlapping, contiguous subsequences having greatest total scores. This improves on the best previously known algorithm, which requires quadratic time in the worst case. The problem arises in biological sequence analysis, where the high-scoring ...
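
To make the problem in the last abstract concrete, here is a minimal Python sketch of one way to find nonoverlapping, contiguous, maximal-scoring subsequences in a single left-to-right pass. It is an illustration under stated assumptions, not the paper's implementation: the function name `maximal_scoring_subsequences` and the segment layout are hypothetical, and the backward scan over earlier segments is done naively, so this version does not achieve the linear-time bound the abstract describes.

```python
def maximal_scoring_subsequences(scores):
    """Return (start, end, total) triples for maximal scoring subsequences.

    `end` is exclusive. Hypothetical helper written for illustration only.
    """
    # Each segment is [start, end, left_sum, right_sum], where left_sum is the
    # cumulative score before the segment and right_sum the cumulative score
    # through its last element.
    segments = []
    total = 0
    for i, score in enumerate(scores):
        before = total
        total += score
        if score <= 0:
            continue  # nonpositive scores never start a segment

        candidate = [i, i + 1, before, total]
        while True:
            # Scan right to left for the nearest segment that starts at a
            # strictly lower cumulative score than the candidate.
            j = len(segments) - 1
            while j >= 0 and segments[j][2] >= candidate[2]:
                j -= 1

            if j < 0 or segments[j][3] >= candidate[3]:
                # Nothing to merge with: the candidate stands on its own.
                segments.append(candidate)
                break

            # Extending left to segment j increases the total, so merge
            # segments j..end into the candidate and re-check it.
            candidate = [segments[j][0], candidate[1],
                         segments[j][2], candidate[3]]
            del segments[j:]

    return [(start, end, right - left) for start, end, left, right in segments]


if __name__ == "__main__":
    example = [4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5]  # illustrative input
    print(maximal_scoring_subsequences(example))
    # [(0, 1, 4), (2, 3, 3), (4, 11, 7)]
```

In the example, the last segment spans indices 4 through 10 with total 7; neither it nor the two single-element segments can be extended or merged to raise their totals.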