Longest Common Extensions in Sublinear Space

@inproceedings{Bille2015LongestCE,
title={Longest Common Extensions in Sublinear Space},
author={Philip Bille and Inge Li G{\o}rtz and Mathias B{\ae}k Tejs Knudsen and Moshe Lewenstein and Hjalte Wedel Vildh{\o}j},
booktitle={CPM},
year={2015}
}
• Published in CPM 10 April 2015
• Computer Science
The longest common extension problem (LCE problem) is to construct a data structure for an input string $$T$$ of length $$n$$ that supports $${\mathrm {LCE}}(i,j)$$ queries. Such a query returns the length of the longest common prefix of the suffixes starting at positions $$i$$ and $$j$$ in $$T$$. This classic problem has a well-known solution that uses $$\mathcal {O}(n)$$ space and $$\mathcal {O}(1)$$ query time. In this paper we show that for any trade-off parameter $$1 \le \tau \le n$$, the…
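To make the query in the abstract concrete, here is a minimal sketch of a naive $${\mathrm {LCE}}(i,j)$$ computation by direct character comparison. It is not the data structure from the paper: it stores nothing beyond the text and takes time proportional to the answer. The function name `lce` and the use of 0-based positions are assumptions made for this illustration.

```python
def lce(text: str, i: int, j: int) -> int:
    """Naive LCE query (illustrative only, not the paper's data structure).

    Returns the length of the longest common prefix of the suffixes of
    `text` starting at 0-based positions i and j (an assumed convention).
    """
    n = len(text)
    if not (0 <= i <= n and 0 <= j <= n):
        raise IndexError("positions must lie in [0, len(text)]")
    length = 0
    # Extend the match one character at a time until a mismatch
    # or the end of either suffix is reached.
    while (i + length < n and j + length < n
           and text[i + length] == text[j + length]):
        length += 1
    return length


if __name__ == "__main__":
    t = "abracadabra"
    print(lce(t, 0, 7))  # suffixes "abracadabra" and "abra" share "abra"
```

Each such query uses no extra space but may scan up to $$n$$ characters, which is the opposite extreme from the $$\mathcal {O}(n)$$-space, $$\mathcal {O}(1)$$-time solution mentioned in the abstract.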
27 Citations
Deterministic Sparse Suffix Sorting on Rewritable Texts
• Computer Science
LATIN
• 2016
Given a rewritable text T of length n on an alphabet of size $$\sigma$$, we propose an online algorithm computing the sparse suffix array and the sparse longest common prefix array of T in \(\mathop
Fast Longest Common Extensions in Small Space
• Computer Science
ArXiv
• 2016
This paper presents two fast and space-efficient solutions based on (Karp-Rabin) fingerprinting and sampling, and is the first result showing that it is possible to answer LCE queries in o(n) time while using only $\mathcal O(1)$ words on top of the space required to store the text.
Small-space encoding LCE data structure with constant-time queries
• Computer Science
ArXiv
• 2017
A data structure of $O(z \tau^2 + \frac{n}{\tau})$ words of space is presented which answers LCE queries in $O(1)$ time and can be built in $O(n \log \sigma)$ time.
Small-Space LCE Data Structure with Constant-Time Queries
• Computer Science
MFCS
• 2017
A data structure of $O(z \tau^2 + \frac{n}{\tau})$ words of space which answers LCE queries in $O(1)$ time and can be built in $O(n \log \sigma)$ time, where $1 \leq \tau \leq \sqrt{n}$ is a parameter, $z$ is the size of the Lempel-Ziv 77 factorization of $w$ and $\sigma$ is the alphabet size.
LCP Array Construction Using O(sort(n)) (or Less) I/Os
• Computer Science
SPIRE
• 2016
The suffix array, one of the most important data structures in modern string processing, needs to be augmented with the longest-common-prefix (LCP) array in many applications. Their construction is
In-Place Longest Common Extensions
The result is a powerful tool that can be used to efficiently solve in-place a wide variety of string processing problems and provides the first in-place algorithms to compute the LCP array in $\mathcal O(n\log n)$ expected time.
Fully Dynamic Data Structure for LCE Queries in Compressed Space
• Computer Science
MFCS
• 2016
The signature encoding $\mathcal{G}$ of $T$ has the capability to support LCE queries in $O(\log N + \log \ell \log^* M)$ time, and it is shown that this is the first fully dynamic LCE data structure.
Deterministic sub-linear space LCE data structures with efficient construction
• Computer Science
CPM
• 2016
A deterministic solution is proposed that achieves a similar space-time trade-off of $O(\tau\min\{\log\tau,\log\frac{n}{\tau}\})$ query time using $O(n/\tau)$ space, but significantly improves the construction time to $O(n\tau)$.
Faster Longest Common Extension Queries in Strings over General Alphabets
• Computer Science
CPM
• 2016
It is shown that a sequence of $q$ LCE queries for a string of size $n$ over a general ordered alphabet can be realized in $O(q \log n + n\log^* n)$ time making only $O(q+n)$ symbol comparisons.
Locally Consistent Parsing for Text Indexing in Small Space
• Computer Science
SODA
• 2020
It is shown how to use ideas based on the Locally Consistent Parsing technique, which was introduced by Sahinalp and Vishkin, in some non-trivial ways in order to improve the known results for the above problems.
25 References
Time-Space Trade-Offs for Longest Common Extensions
• Computer Science
CPM
• 2012
Bounds for the longest common extension (LCE) problem are revisited; the new bounds almost match the previously known bounds at the extremes when τ=1 or τ=n, providing the first smooth trade-offs for the LCE problem.
Fast Algorithms for Finding Nearest Common Ancestors
• Computer Science, Mathematics
SIAM J. Comput.
• 1984
An algorithm for a random access machine with uniform cost measure (and a bound of $\Omega(\log n)$ on the number of bits per word) that requires $O(1)$ time per query and $O(n)$ preprocessing time is presented, assuming that the collection of trees is static.
Faster Sparse Suffix Sorting
• Computer Science
STACS
• 2014
An $O(n)$ time Monte Carlo algorithm using $O(b \log b)$ space and an $O(n \log b)$ time Las Vegas algorithm, both of which are a significant improvement over the best prior solutions.
Suffix arrays: a new method for on-line string searches
• Computer Science
SODA '90
• 1990
A new and conceptually simple data structure, called a suffix array, for on-line string searches is introduced in this paper, and it is believed that suffix arrays will prove to be better in practice than suffix trees for many applications.
Incremental String Comparison
• Computer Science
SIAM J. Comput.
• 1998
This paper considers the following incremental version of comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them, and obtains O(nk) algorithms for the longest prefix approximate match problem, the approximate overlap problem, and cyclic string comparison.
A New Linear-Time ``On-Line'' Algorithm for Finding the Smallest Initial Palindrome of a String
The present algorithm, based on the Knuth-Morris-Pratt algorithm, solves the problem of recognizing the initial leftmost nonvoid palindrome of a string in time proportional to the length N of the palindrome, and an extension allows one to recognize the initial odd or even palindromes of length 2 or greater.
Approximate string matching: a simpler faster algorithm
• Computer Science
SODA '98
• 1998
We give two algorithms for finding all approximate matches of a pattern in a text, where the edit distance between the pattern and the matching text substring is at most k. The first algorithm, which
Range Non-overlapping Indexing and Successive List Indexing
• Computer Science