Longest Common Extensions in Sublinear Space

Philip Bille, Inge Li Gørtz, Mathias Bæk Tejs Knudsen, Moshe Lewenstein, Hjalte Wedel Vildhøj
The longest common extension problem (LCE problem) is to construct a data structure for an input string \(T\) of length \(n\) that supports \({\mathrm {LCE}}(i,j)\) queries. Such a query returns the length of the longest common prefix of the suffixes starting at positions \(i\) and \(j\) in \(T\). This classic problem has a well-known solution that uses \(\mathcal {O}(n)\) space and \(\mathcal {O}(1)\) query time. In this paper we show that for any trade-off parameter \(1 \le \tau \le n\), the… 
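As a point of reference for the query defined above, here is a minimal sketch of the naive approach: no data structure beyond the text itself, with each query answered by direct character comparison. The function name `lce` and the use of 0-based positions are assumptions made for illustration. This is a baseline and not the data structure described in the paper.

```python
def lce(text: str, i: int, j: int) -> int:
    """Return the length of the longest common prefix of text[i:] and text[j:].

    Positions are 0-based (an assumption for this sketch). Worst-case time is
    proportional to the answer; no extra space is used beyond the text.
    """
    n = len(text)
    length = 0
    # Extend the match one character at a time until a mismatch or the end.
    while (i + length < n and j + length < n
           and text[i + length] == text[j + length]):
        length += 1
    return length
```

For example, in `"abracadabra"` the suffixes starting at positions 0 and 7 are `"abracadabra"` and `"abra"`, so `lce("abracadabra", 0, 7)` is 4.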
Deterministic Sparse Suffix Sorting on Rewritable Texts
Given a rewritable text T of length n on an alphabet of size \(\sigma \), we propose an online algorithm computing the sparse suffix array and the sparse longest common prefix array of T in \(\mathop
Fast Longest Common Extensions in Small Space
This paper presents two fast and space-efficient solutions based on (Karp-Rabin) fingerprinting and sampling, and is the first result showing that it is possible to answer LCE queries in o(n) time while using only $\mathcal O(1)$ words on top of the space required to store the text.
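The general idea of answering LCE queries with Karp-Rabin fingerprints can be sketched as follows. This is an illustration only and not the algorithm from the paper above: it stores a fingerprint for every prefix, so it uses O(n) words of extra space rather than O(1), and it has no sampling step. The class name `FingerprintLCE`, the fixed base, and the modulus are all assumptions chosen for this sketch. A practical Karp-Rabin scheme would pick the base at random, and equal fingerprints only imply equal substrings with high probability.

```python
class FingerprintLCE:
    """LCE via prefix fingerprints and binary search (illustrative sketch)."""

    def __init__(self, text: str, base: int = 256, mod: int = (1 << 61) - 1):
        # base and mod are illustrative choices, not values from any paper.
        self.text = text
        self.mod = mod
        n = len(text)
        self.prefix = [0] * (n + 1)  # prefix[k] = fingerprint of text[:k]
        self.power = [1] * (n + 1)   # power[k] = base**k mod mod
        for k, ch in enumerate(text):
            self.prefix[k + 1] = (self.prefix[k] * base + ord(ch)) % mod
            self.power[k + 1] = (self.power[k] * base) % mod

    def _fingerprint(self, start: int, length: int) -> int:
        """Fingerprint of text[start:start + length]."""
        end = start + length
        return (self.prefix[end] - self.prefix[start] * self.power[length]) % self.mod

    def lce(self, i: int, j: int) -> int:
        """Largest length whose fingerprints at i and j agree (0-based positions)."""
        lo, hi = 0, len(self.text) - max(i, j)
        # If a prefix of length m matches, every shorter prefix matches too,
        # so the answer can be found by binary search over the length.
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self._fingerprint(i, mid) == self._fingerprint(j, mid):
                lo = mid
            else:
                hi = mid - 1
        return lo
```

Each query performs O(log n) fingerprint comparisons. A false positive from a fingerprint collision can only make the reported length too large, never too small.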
Small-space encoding LCE data structure with constant-time queries
A data structure of $O(z \tau^2 + \frac{n}{\tau})$ words of space is presented which answers LCE queries in $O(1)$ time and can be built in $O(n \log \sigma)$ time.
Small-Space LCE Data Structure with Constant-Time Queries
A data structure of $O(z \tau^2 + \frac{n}{\tau})$ words of space is presented which answers LCE queries in $O(1)$ time and can be built in $O(n \log \sigma)$ time, where $1 \leq \tau \leq \sqrt{n}$ is a parameter, $z$ is the size of the Lempel-Ziv 77 factorization of $w$, and $\sigma$ is the alphabet size.
LCP Array Construction Using O(sort(n)) (or Less) I/Os
The suffix array, one of the most important data structures in modern string processing, needs to be augmented with the longest-common-prefix (LCP) array in many applications. Their construction is
In-Place Longest Common Extensions
The result is a powerful tool that can be used to efficiently solve in-place a wide variety of string processing problems, and it provides the first in-place algorithms to compute the LCP array in $\mathcal O(n\log n)$ expected time.
Fully Dynamic Data Structure for LCE Queries in Compressed Space
The signature encoding $\mathcal{G}$ of $T$ supports LCE queries in $O(\log N + \log \ell \log^* M)$ time, and it is shown that this is the first fully dynamic LCE data structure.
Deterministic sub-linear space LCE data structures with efficient construction
A deterministic solution is proposed that achieves a similar space-time trade-off of $O(\tau\min\{\log\tau,\log\frac{n}{\tau}\})$ query time using $O(n/\tau)$ space, but significantly improves the construction time to $O(n\tau)$.
Faster Longest Common Extension Queries in Strings over General Alphabets
It is shown that a sequence of $q$ LCE queries for a string of size $n$ over a general ordered alphabet can be realized in $O(q \log n + n\log^* n)$ time making only $O(q+n)$ symbol comparisons.
Locally Consistent Parsing for Text Indexing in Small Space
It is shown how to use ideas based on the Locally Consistent Parsing technique, that was introduced by Sahinalp and Vishkin, in some non-trivial ways in order to improve the known results for the above problems.


Time-Space Trade-Offs for Longest Common Extensions
The longest common extension (LCE) problem is revisited, with bounds that almost match the previously known bounds at the extremes when τ=1 or τ=n, providing the first smooth trade-offs for the LCE problem.
Fast Algorithms for Finding Nearest Common Ancestors
An algorithm for a random access machine with uniform cost measure (and a bound of $\Omega (\log n)$ on the number of bits per word) that requires time per query and preprocessing time is presented, assuming that the collection of trees is static.
Faster Sparse Suffix Sorting
An O(n) time Monte Carlo algorithm using O(b log b) space and an O(n log b) time Las Vegas algorithm are presented, both of which are a significant improvement over the best prior solutions.
Suffix arrays: a new method for on-line string searches
A new and conceptually simple data structure, called a suffix array, for on-line string searches is introduced in this paper, and it is believed that suffix arrays will prove to be better in practice than suffix trees for many applications.
Incremental String Comparison
This paper considers the following incremental version of comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them, and obtains O(nk) algorithms for the longest prefix approximate match problem, the approximate overlap problem, and cyclic string comparison.
A New Linear-Time ``On-Line'' Algorithm for Finding the Smallest Initial Palindrome of a String
The present algorithm, based on the Knuth-Morris-Pratt algorithm, solves the problem of recognizing the initial leftmost nonvoid palindrome of a string in time proportional to the length N of the palindrome, and an extension allows one to recognize the initial odd or even palindromes of length 2 or greater.
An O(n log n) Algorithm for Finding All Repetitions in a String
Approximate string matching: a simpler faster algorithm
We give two algorithms for finding all approximate matches of a pattern in a text, where the edit distance between the pattern and the matching text substring is at most k. The first algorithm, which
Range Non-overlapping Indexing and Successive List Indexing
Two natural variants of the indexing problem are presented, specifically a variation of the range searching for minimum problem of Lenhof and Smid, here considered over a grid, in what appears to be the first utilization of range search for minimum in an indexing-related context.
An Algorithm for Approximate Tandem Repeats
This paper considers two criteria of similarity: the Hamming distance (k mismatches) and the edit distance (k differences) for a string S of length n and an integer k.