Enhanced Covers of Regular and Indeterminate Strings Using Prefix Tables

Ali Alatabbi, A. S. M. S. Islam, Mohammad Sohel Rahman, J. Simpson, W. F. Smyth
A \itbf{cover} of a string $x = x[1..n]$ is a proper substring $u$ of $x$ such that $x$ can be constructed from possibly overlapping instances of $u$. A recent paper \cite{FIKPPST13} relaxes this definition --- an \itbf{enhanced cover} $u$ of $x$ is a border of $x$ (that is, a proper prefix that is also a suffix) that covers a {\it maximum} number of positions in $x$ (not necessarily all) --- and proposes efficient algorithms for the computation of enhanced covers. These algorithms depend on…
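The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm of the paper or of \cite{FIKPPST13}: it simply builds the prefix table (often called the Z-array) and then tests every border in turn, which can take quadratic time. The function names are hypothetical, indices are 0-based rather than the 1-based $x[1..n]$ used above, and breaking ties in favour of the shortest border is an assumption of this sketch.

```python
def prefix_table(x):
    """z[i] = length of the longest substring starting at i that matches a prefix of x."""
    n = len(x)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n
    l = r = 0  # [l, r) is the rightmost prefix match found so far
    for i in range(1, n):
        if i < r:
            z[i] = min(r - i, z[i - l])
        while i + z[i] < n and x[z[i]] == x[i + z[i]]:
            z[i] += 1
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z


def positions_covered(z, b):
    """Count positions lying inside some occurrence of the prefix of length b."""
    covered = 0
    end = 0  # first position not yet covered by an earlier occurrence
    for i, zi in enumerate(z):
        if zi >= b:  # the prefix of length b occurs at position i
            covered += (i + b) - max(i, end)
            end = i + b
    return covered


def enhanced_cover(x):
    """Return (border, positions covered) for a border covering the most positions.

    Assumption: ties are broken in favour of the shortest border.
    Returns ("", 0) when x has no nonempty proper border.
    """
    n = len(x)
    z = prefix_table(x)
    best_len, best_cov = 0, 0
    for b in range(1, n):  # proper borders only
        if z[n - b] == b:  # suffix of length b equals prefix of length b
            cov = positions_covered(z, b)
            if cov > best_cov:
                best_len, best_cov = b, cov
    return x[:best_len], best_cov
```

For example, the only nonempty border of "abaab" is "ab", which occurs at positions 0 and 3 and so covers four of the five positions; it is an enhanced cover but not a cover.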
Optimal Rank and Select Queries on Dictionary-Compressed Text
  • N. Prezza
  • Mathematics, Computer Science
  • CPM
  • 2019
The solutions are given in the form of a space-time trade-off that is more general than the one previously known for grammars and that improves existing bounds on LZ77-compressed text by a $\log\log n$ time-factor in \emph{select} queries.
On approximate enhanced covers under Hamming distance
  • Ondrej Guth
  • Computer Science, Mathematics
  • Discret. Appl. Math.
  • 2020
In this paper, two more general notions based on enhanced covers are introduced: a k-approximate enhanced cover and a relaxed k-approximate enhanced cover, where a fixed maximum number of errors k under the Hamming distance is considered.
String covering with optimal covers
It is shown that both the longest and the shortest optimal covers for a given string w of length n can be computed easily and efficiently in O(n log n) time and O(n) space.
The Role of The Prefix Array in Sequence Analysis: A Survey
The prefix array was apparently first computed and used algorithmically in 1984, playing a pivotal role in an optimal algorithm to determine all the tandem repeats in a given (DNA or protein)…
On Indeterminate Strings Matching
This work establishes NP-hardness of the order-preserving version for r=2, thus solving a question explicitly stated by Henriques et al.
Periods and borders of random words
It is shown that the asymptotic probability that a random word has a given maximal border length $k$ is a constant, depending only on $k$ and the alphabet size $\ell$, and a recurrence is given that allows us to determine these constants with any required precision.
Frequency Covers for Strings
This paper proposes an effective, easily computed form of quasi-periodicity in strings, the frequency cover, that is, the longest of those repeating substrings u of w, |u| > 1, that occur the maximum number of times in w.
An Overview of String Processing Applications to Data Analytics
Data analytics may conveniently be divided into four stages: preparation, preprocessing, analysis, and post-processing. Especially in the second and third of these, where the data is cleaned,…
Quasi-Periodicity in Streams
This work shows two streaming algorithms for computing the length of the shortest cover of a string of length n and shows that there is no sublinear-space streaming algorithm for computing the length of the shortest seed of a string.


Computing covers using prefix tables
A linear-time algorithm to compute the cover array of a regular string x based on the prefix table of x is described, and this result is extended to indeterminate strings.
Finding All Covers of an Indeterminate String in O(n) Time on Average
The algorithm is applicable to both regular and indeterminate strings and uses the pattern-matching technique of the Aho-Corasick automaton to compute all the covers of x from the border array.
Enhanced string covering
New, simple, easily computed, and widely applicable notions of string covering that provide an intuitive and useful characterisation of a string are proposed: the enhanced cover; the enhanced left cover; and the enhanced left seed.
New Perspectives on the Prefix Array
Two Θ(n)-time algorithms, PL1 & PL2, are described to compute POS/LEN for regular strings using only 8m bytes of storage in addition to the n bytes required for x, along with an extension IPL of PL1 that computes POS/LEN in O(n^2) worst-case time (though generally much faster), still using only 7m bytes of additional storage.
Inferring an indeterminate string from a prefix graph
This paper shows, given a feasible array y, how to use P_y to construct a lexicographically least indeterminate string on a minimum alphabet whose prefix table π = y.
Indeterminate strings, prefix arrays & undirected graphs
It is shown using a graph model that every feasible array of integers is a prefix array of some (indeterminate or regular) string, and for regular strings corresponding to y, the model is used to provide a lower bound on the alphabet size.
An Optimal Algorithm to Compute all the Covers of a String
The characterization theorem gives rise to a simple recursive algorithm which computes all the covers of x in time Θ(n) in terms of an easily computed normal form for x.
New complexity results for the k-covers problem
It is shown that a minimum k-cover can be approximated to within a factor k in polynomial time, and that kCP is a special case of RVCP_k restricted to certain classes G_{x,k} of graphs that represent all strings x.
A new approach to the periodicity lemma on strings with holes
An algorithm is described that, given the locations of the holes in a string, computes maximum-length substrings to which the periodicity lemma applies, in time proportional to the number of holes.
An Optimal On-Line Algorithm To Compute All The Covers Of A String
Let x denote a given nonempty string of length n = |x|. A substring u of x is a cover of x if and only if every position of x lies within an occurrence of u within x. This paper extends the work of…
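The cover definition in the last entry can be checked directly. The following Python sketch is a naive illustration under stated assumptions, not the on-line algorithm of that paper: `is_cover` and `covers` are hypothetical names, positions are 0-based, and x itself is counted as a cover because the definition quoted here does not require u to be proper.

```python
def is_cover(u, x):
    """True if every position of x lies within some occurrence of u in x."""
    if not u or len(u) > len(x):
        return False
    covered_up_to = 0  # positions 0 .. covered_up_to - 1 are covered so far
    i = x.find(u)
    while i != -1:
        if i > covered_up_to:
            return False  # gap: position covered_up_to is in no occurrence
        covered_up_to = i + len(u)
        i = x.find(u, i + 1)  # next occurrence, overlaps allowed
    return covered_up_to == len(x)


def covers(x):
    """All covers of x, shortest first. A cover must occur at position 0, so it is a prefix."""
    return [x[:k] for k in range(1, len(x) + 1) if is_cover(x[:k], x)]
```

For "ababab" this yields "ab", "abab" and the string itself; "aba" is rejected because its occurrences at positions 0 and 2 leave the final position uncovered.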