Almost-optimal fully LZW-compressed pattern matching

@article{Gsieniec1999AlmostoptimalFL,
  title={Almost-optimal fully LZW-compressed pattern matching},
  author={Leszek Gąsieniec and Wojciech Rytter},
  journal={Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)},
  year={1999},
  pages={316--325}
}
  • Leszek Gąsieniec, W. Rytter
  • Published 29 March 1999
  • Computer Science
  • Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)
Given two strings, a pattern P and a text T of lengths |P|=M and |T|=N, the string matching problem is to find all occurrences of P in T. The fully compressed string matching problem is the same problem with the input strings P and T given in compressed forms p and t respectively, where |p|=m and |t|=n. We present the first almost-optimal string matching algorithms for LZW-compressed strings, running in: (1) O((n+m)log(n+m)) time on a single-processor machine; and (2) Õ… 
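
To make the problem statement concrete, here is a minimal Python sketch of the naive baseline that fully compressed matching aims to avoid: decompress both p and t, then search the plain strings. This is not the algorithm from the paper. The function names, the fixed-alphabet LZW variant, and the codeword numbering are assumptions chosen for illustration only.

```python
def lzw_compress(text, alphabet):
    """Return the list of LZW codewords for `text` (illustrative variant).

    Assumption: the dictionary starts with one entry per alphabet symbol
    and grows without bound; real LZW implementations differ in details.
    """
    table = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    current = ""
    for ch in text:
        if current + ch in table:
            current += ch                      # extend the current phrase
        else:
            codes.append(table[current])       # emit the longest known phrase
            table[current + ch] = len(table)   # learn the new phrase
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes, alphabet):
    """Invert lzw_compress for the same alphabet."""
    if not codes:
        return ""
    table = {i: ch for i, ch in enumerate(alphabet)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # The codeword refers to the phrase being defined right now.
            entry = prev + prev[0]
        out.append(entry)
        table[len(table)] = prev + entry[0]
        prev = entry
    return "".join(out)


def naive_fully_compressed_match(p_codes, t_codes, alphabet):
    """Baseline only: decompress p and t, then report all occurrences.

    The work depends on the uncompressed lengths M and N, not on m and n.
    """
    pattern = lzw_decompress(p_codes, alphabet)
    text = lzw_decompress(t_codes, alphabet)
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    sigma = "ab"
    t = lzw_compress("abababab", sigma)
    p = lzw_compress("aba", sigma)
    print(len(t), len(p), naive_fully_compressed_match(p, t, sigma))
```

Because this baseline expands both inputs first, its cost grows with N and M, which for highly repetitive strings can be much larger than n and m. The bounds stated in the abstract are expressed in n and m alone.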

Optimal pattern matching in LZW compressed strings
TLDR
When the text t is compressed using the LZW method, an occurrence of the pattern s can be detected in optimal linear time, thus answering a question of Amir, Benson, and Farach.
Pattern Matching on Grammar-Compressed Strings in Linear Time
TLDR
An O(n+m) time algorithm that, given a context-free grammar of size n that produces a single string t and a pattern p of length m, decides whether p occurs in t as a substring is presented.
Faster Fully Compressed Pattern Matching by Recompression
  • Artur Jeż
  • Computer Science
    ACM Trans. Algorithms
  • 2015
TLDR
In this article, a fully compressed pattern matching problem is studied using a recently developed technique of local recompression: the SLPs are refactored so that substrings of the pattern and text are encoded in both SLPs in the same way.
Efficiency of Fast Parallel Pattern Searching in Highly Compressed Texts
TLDR
It is shown how to improve a naive straightforward NC algorithm and obtain almost optimal parallel RLZ-compressed matching applying tree-contraction techniques to directed acyclic graphs with polynomial tree-size.
Tying up the loose ends in fully LZW-compressed pattern matching
TLDR
An optimal linear time solution is developed for the case when p and t are compressed using the LZW method, which improves the previously known O((n+m)log(n+m)) time solution of Gąsieniec and Rytter, and essentially closes the line of research devoted to studying LZW-compressed exact pattern matching.
A Fully Compressed Algorithm for Computing the Edit Distance of Run-Length Encoded Strings
TLDR
This paper presents the first “fully compressed” algorithm whose running time depends solely on the compressed string lengths, and yields the first fully compressed solution to approximate matching of a pattern of m runs in a text of n runs in O(mn^2) time.
String Indexing with Compressed Patterns
TLDR
This paper considers the basic variant where the pattern is given in compressed form and the goal is to achieve query time that is fast in terms of the compressed size of the pattern, and develops several data structural techniques of independent interest.
Analyzing the performance differences between pattern matching and compressed pattern matching on texts
TLDR
The aim of the developed compression algorithm is to point out the difference in text processing between compressed and uncompressed text and to inform other applications.
Fast Pattern Matching in Compressed Text using Wavelet Tree
TLDR
This paper presents an efficient algorithm (WBTC_WT) for matching a pattern directly inside the compressed text and finds that this algorithm outperforms the existing algorithms in most of the cases.

References

Constant-Time Randomized Parallel String Matching
TLDR
A constant-expected-time Las Vegas algorithm for computing the period of the pattern and all witnesses and thus for string matching itself, and an $\Omega(\log\log m)$ lower bound is known for deterministic algorithms.
An Improved Pattern Matching Algorithm for Strings in Terms of Straight-Line Programs
TLDR
An O(n^2 m^2) time algorithm using O(nm) space is developed, which outputs a compact representation of all occurrences of P in T, which is superior to the algorithm proposed by Karpinski et al.
Let sleeping files lie: pattern matching in Z-compressed files
TLDR
This paper considers pattern matching without decompression in the UNIX Z-compression, a variant of the Lempel Ziv adaptive compression scheme, and shows how to modify the algorithms to achieve a trade-off between the amount of extra space used and the algorithm's time complexity.
Efficient Algorithms for Lempel-Ziv Encoding
We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes then the problems can be solved deterministically in polynomial time in the case when
Randomized Efficient Algorithms for Compressed Strings: The Finger-Print Approach (Extended Abstract)
TLDR
The equality testing is reduced to the equivalence of certain context-free grammars generating single strings and the time complexity of several classical problems for texts is related to the complexity Eq(n) of equality-testing.
Pattern-Matching for Strings with Short Descriptions
TLDR
A textual problem for exponentially long strings is reduced here to simple arithmetic on integers with (only) linearly many bits, which makes it possible to represent some sets of exponentially many positions in terms of feasibly many arithmetic progressions.
Efficient algorithms for Lempel-Ziv encoding
TLDR
It is shown that if the input texts are given by their Lempel-Ziv codes then the problems can be solved deterministically in polynomial time in the case when the original (uncompressed) texts are of exponential size.
Fast Pattern Matching in Strings
TLDR
An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings, showing that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.
Optimal Parallel Pattern Matching in Strings