We present an algorithm which computes the Lempel-Ziv factorization of a word W of length n online in the following sense: it reads W starting from the left, and, after reading each r = O(log n) characters of W, updates the Lempel-Ziv factorization. The algorithm requires O(n) bits of space and O(n log² n) time. The basis of the algorithm is a sparse…

We study a new variant of the string matching problem called cross-document string matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern itself is a substring of another document. Several variants of this problem are considered, and efficient linear-space…

Given m documents of total length n, we consider the problem of finding a longest string common to at least d ≥ 2 of the documents. This problem is known as the longest common substring (LCS) problem and has a classic O(n) space and O(n) time solution (Weiner [FOCS'73], Hui [CPM'92]). However, the use of linear space is impractical in many applications. In…
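To make the problem statement concrete, the following is a minimal brute-force sketch of the LCS problem for at least d of m documents. It is not the linear-time solution cited above, nor the method of the paper; the function name `longest_common_substring` and its parameters are illustrative assumptions.

```python
def longest_common_substring(docs, d=2):
    """Return a longest string occurring in at least d of the documents.

    Brute force for illustration only; far slower than the O(n)-time
    suffix-tree solutions cited in the abstract.
    """
    best = ""
    for doc in docs:
        n = len(doc)
        for i in range(n):
            # Only try candidates strictly longer than the current best.
            for j in range(i + len(best) + 1, n + 1):
                cand = doc[i:j]
                count = sum(1 for other in docs if cand in other)
                if count >= d:
                    best = cand
                else:
                    # A longer candidate starting at i contains cand as a
                    # prefix, so it cannot occur in more documents.
                    break
    return best


if __name__ == "__main__":
    docs = ["abcde", "xbcdy", "zzcdz"]
    print(longest_common_substring(docs, d=2))
    print(longest_common_substring(docs, d=3))
```

When several longest strings exist, this sketch returns the first one it finds.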

We present an improved wavelet tree construction algorithm and discuss its applications to a number of rank/select problems for integer keys and strings. Given a string of length n over an alphabet of size σ ≤ n, our method builds the wavelet tree in O(n log σ / √(log n)) time, improving upon the state-of-the-art algorithm by a factor of √(log n). As a…

We consider the problem of dictionary matching in a stream. Given a set of strings, known as a dictionary, and a stream of characters arriving one at a time, the task is to report each time some string in our dictionary occurs in the stream. We present a randomised algorithm which takes O(log log(k + m)) time per arriving character and uses O(k log m) words…

The Longest Common Substring problem is to compute the longest substring which occurs in at least d ≥ 2 of m strings of total length n. In this paper we ask the question whether this problem allows a deterministic time-space trade-off using O(n^(1+ε)) time and O(n^(1−ε)) space for 0 ≤ ε ≤ 1. We give a positive answer in the case of two strings (d = m = 2) and 0…

We revisit the complexity of one of the most basic problems in pattern matching. In the k-mismatch problem we must compute the Hamming distance between a pattern of length m and every m-length substring of a text of length n, as long as that Hamming distance is at most k. Where the Hamming distance is greater than k at some alignment of the pattern and…
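The k-mismatch problem as stated can be illustrated with a naive O(nm)-time sketch. This is only a baseline for understanding the problem, not the algorithm of the paper. The function name `k_mismatch` is hypothetical, and returning `None` for alignments whose distance exceeds k is an assumption of this sketch, since the abstract is cut off before saying what is reported in that case.

```python
def k_mismatch(text, pattern, k):
    """For each alignment, return the Hamming distance if it is at most k.

    Alignments whose distance exceeds k yield None (a convention assumed
    here for illustration).
    """
    n, m = len(text), len(pattern)
    result = []
    for i in range(n - m + 1):
        mismatches = 0
        for a, b in zip(text[i:i + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break  # no need to count further for this alignment
        result.append(mismatches if mismatches <= k else None)
    return result


if __name__ == "__main__":
    print(k_mismatch("abcabd", "abd", k=1))
```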

Given a string S of length n, its maximal unbordered factor is the longest factor which does not have a border. In this work we investigate the relationship between n and the length of the maximal unbordered factor of S. We prove that for the alphabet of size σ ≥ 5 the expected length of the maximal unbordered factor of a string of length n is at least…
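As an illustration of the definition, here is a brute-force sketch that finds a maximal unbordered factor. It assumes the standard notion of a border (a non-empty proper prefix that is also a suffix), which the abstract does not spell out. The helper names `has_border` and `maximal_unbordered_factor` are hypothetical, and the sketch is not taken from the paper.

```python
def has_border(s):
    """True if some non-empty proper prefix of s is also a suffix of s."""
    return any(s[:b] == s[-b:] for b in range(1, len(s)))


def maximal_unbordered_factor(s):
    """Return a longest factor (substring) of s that has no border.

    Brute force: tries lengths from longest to shortest, so it is only
    suitable for short strings.
    """
    n = len(s)
    for length in range(n, 0, -1):
        for i in range(n - length + 1):
            factor = s[i:i + length]
            if not has_border(factor):
                return factor
    return ""  # only reached for the empty string


if __name__ == "__main__":
    for word in ["aaa", "abab", "aab"]:
        print(word, maximal_unbordered_factor(word))
```

Any single character is unbordered under this definition, so a non-empty string always has a maximal unbordered factor of length at least 1.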