Learn More
The proliferation of online text, such as on the World Wide Web and in databases, motivates the need for space-efficient index methods that support fast search. Consider a text T of n binary symbols to index. Given any query pattern P of m binary symbols, the goal is to search f?r P in T quickly, with T being fully scanned only once, nafiaely, when the(More)
We introduce a new text-indexing data structure, the <italic>String B-Tree</italic>, that can be seen as a link between some traditional external-memory and string-matching data structures. In a short phrase, it is a combination of B-trees and Patricia tries for internal-node indices that is made more effective by adding extra pointers to speed up search(More)
We report on a new and improved version of high-order entropy-compressed suffix arrays, which has theoretical performance guarantees similar to those in our earlier work [16], yet represents an improvement in practice. Our experiments indicate that the resulting text index offers state-of-the-art compression. In particular, we require roughly 20% of the(More)
Tries are popular data structures for storing a set of strings, where common prefixes are represented by common root-to-node paths. More than 50 years of usage have produced many variants and implementations to overcome some of their limitations. We explore new succinct representations of path-decomposed tries and experimentally evaluate the corresponding(More)
We consider the problem of representing, in a compressed format, a bit-vector S of m bits with n 1s, supporting the following operations, where b ∈ {0, 1}: • rank b (S, i) returns the number of occurrences of bit b in the prefix S [1..i]; • select b (S, i) returns the position of the ith occurrence of bit b in S. Such a data structure is called fully(More)
Motif inference represents one of the most important areas of research in computational biology, and one of its oldest ones. Despite this, the problem remains very much open in the sense that no existing definition is fully satisfying, either in formal terms, or in relation to the biological questions that involve finding such motifs. Two main types of(More)