
We present an improved wavelet tree construction algorithm and discuss its applications to a number of rank/select problems for integer keys and strings. Given a string of length n over an alphabet of size σ ≤ n, our method builds the wavelet tree in O(n log σ / √(log n)) time, improving upon the state-of-the-art algorithm by a factor of √(log n). As a…

We revisit the complexity of one of the most basic problems in pattern matching. In the k-mismatch problem we must compute the Hamming distance between a pattern of length m and every m-length substring of a text of length n, as long as that Hamming distance is at most k. Where the Hamming distance is greater than k at some alignment of the pattern and…

- Tatiana A. Starikovskaya
- MFCS
- 2012

Given a string S of length n, its maximal unbordered factor is the longest factor which does not have a border. In this work we investigate the relationship between n and the length of the maximal unbordered factor of S. We prove that for the alphabet of size σ ≥ 5 the expected length of the maximal unbordered factor of a string of length n is at least…
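To make the definition concrete, here is a brute-force sketch, assuming the usual meaning of a border: a non-empty proper prefix of a string that is also its suffix. The helper names `has_border` and `maximal_unbordered_factor` are hypothetical, and the code only illustrates the definition rather than any method from the paper.

```python
def has_border(s):
    """True if some non-empty proper prefix of s is also a suffix of s."""
    return any(s[:b] == s[-b:] for b in range(1, len(s)))


def maximal_unbordered_factor(s):
    """Return a longest factor (substring) of s that has no border.

    Brute force: try lengths from longest to shortest and return the
    leftmost unbordered factor of the first length that has one.
    """
    n = len(s)
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            factor = s[start:start + length]
            if not has_border(factor):
                return factor
    return ""  # only reached for the empty string


print(maximal_unbordered_factor("abab"))
```

Any single character is unbordered, so a non-empty input always yields a factor of length at least 1.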

- Roman Kolpakov, Gregory Kucherov, Tatiana A. Starikovskaya
- 2011 First International Conference on Data…
- 2011

We consider a compact text index based on evenly spaced sparse suffix trees of a text [9]. Such a tree is defined by partitioning the text into blocks of equal size and constructing the suffix tree only for those suffixes that start at block boundaries. We propose a new pattern matching algorithm on this structure. The algorithm is based on a notion of…

We study the following three problems of computing generic or discriminating words for a given collection of documents. Given a pattern P and a threshold d, we want to report (i) all longest extensions of P which occur in at least d documents, (ii) all shortest extensions of P which occur in less than d documents, and (iii) all shortest extensions of P…

We study a new variant of the string matching problem called cross-document string matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern itself is a substring of another document. Several variants of this problem are considered, and efficient linear-space…

We consider the problem of dictionary matching in a stream. Given a set of strings, known as a dictionary, and a stream of characters arriving one at a time, the task is to report each time some string in our dictionary occurs in the stream. We present a randomised algorithm which takes O(log log(k + m)) time per arriving character and uses O(k log m) words…

Given m documents of total length n, we consider the problem of finding a longest string common to at least d ≥ 2 of the documents. This problem is known as the longest common substring (LCS) problem and has a classic O(n) space and O(n) time solution (Weiner [FOCS’73], Hui [CPM’92]). However, the use of linear space is impractical in many applications. In…
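The problem statement can be pinned down with a small brute-force sketch. It is neither the classic linear-time solution cited above nor the approach of the paper; the function name `longest_common_substring` and the leftmost tie-breaking are assumptions made for illustration.

```python
def longest_common_substring(docs, d):
    """Return a longest string that occurs in at least d of the documents.

    Brute force: extend candidate substrings of each document and count
    how many documents contain them. Far slower than the suffix-tree
    solutions; meant only to illustrate the definition.
    """
    best = ""
    for doc in docs:
        for i in range(len(doc)):
            # Only candidates strictly longer than the current best matter.
            for j in range(i + len(best) + 1, len(doc) + 1):
                cand = doc[i:j]
                if sum(cand in other for other in docs) >= d:
                    best = cand
                else:
                    # A longer extension cannot occur in more documents.
                    break
    return best


print(longest_common_substring(["banana", "ananas", "bandana"], 2))
```

The early `break` relies on monotonicity: if a substring occurs in fewer than d documents, so does every extension of it.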