Almost-optimal fully LZW-compressed pattern matching

@article{Gsieniec1999AlmostoptimalFL,
title={Almost-optimal fully LZW-compressed pattern matching},
author={Leszek Gąsieniec and Wojciech Rytter},
journal={Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)},
year={1999},
pages={316-325}
}
• Published 29 March 1999
• Computer Science
• Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)
Given two strings, a pattern P and a text T of lengths |P|=M and |T|=N, the string matching problem is to find all occurrences of P in T. The fully compressed string matching problem is the string matching problem with the input strings P and T given in compressed forms p and t respectively, where |p|=m and |t|=n. We present the first almost-optimal string matching algorithms for LZW-compressed strings, running in: (1) O((n+m)log(n+m)) time on a single processor machine; and (2) $\tilde{O}$…
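To make the problem statement concrete, here is a minimal Python sketch of the naive baseline: decompress both LZW inputs and search the plain strings. This is not the algorithm of the paper, which avoids decompression. The names `lzw_compress`, `lzw_decompress` and `naive_compressed_match` are hypothetical, and the textbook LZW variant with a fixed, known alphabet is an assumption.

```python
def lzw_compress(text, alphabet):
    """Return the list of LZW codewords for `text` (assumed textbook variant)."""
    # The dictionary starts with one code per alphabet symbol.
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the current phrase
        else:
            codes.append(dictionary[phrase])
            dictionary[phrase + ch] = len(dictionary)  # learn a new phrase
            phrase = ch
    if phrase:
        codes.append(dictionary[phrase])
    return codes


def lzw_decompress(codes, alphabet):
    """Rebuild the original string from its LZW codewords."""
    phrases = list(alphabet)  # code i maps to phrases[i]
    if not codes:
        return ""
    prev = phrases[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code < len(phrases):
            entry = phrases[code]
        else:
            # The code refers to the phrase that is being defined right now.
            entry = prev + prev[0]
        phrases.append(prev + entry[0])
        out.append(entry)
        prev = entry
    return "".join(out)


def naive_compressed_match(p_codes, t_codes, alphabet):
    """Baseline only: costs time proportional to N and M, not to n and m."""
    pattern = lzw_decompress(p_codes, alphabet)
    text = lzw_decompress(t_codes, alphabet)
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)  # allow overlapping occurrences
    return hits
```

For example, with the alphabet `"ab"`, compressing the text `"abababab"` and the pattern `"aba"` and passing both code lists to `naive_compressed_match` reports the occurrences at positions 0, 2 and 4. The baseline touches all N + M uncompressed symbols, which is the cost a fully compressed algorithm is meant to avoid.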
45 Citations

Optimal pattern matching in LZW compressed strings
An algorithm is presented that, when t is compressed using the LZW method, detects an occurrence of s in optimal linear time, thus answering a question of Amir, Benson, and Farach.
Pattern Matching on Grammar-Compressed Strings in Linear Time
• Computer Science
SODA
• 2022
An O(n+m) time algorithm that, given a context-free grammar of size n that produces a single string t and a pattern p of length m, decides whether p occurs in t as a substring is presented.
Faster Fully Compressed Pattern Matching by Recompression
• Artur Jeż
• Computer Science
ACM Trans. Algorithms
• 2015
In this article, a fully compressed pattern matching problem is studied using a recently developed technique of local recompression: the SLPs are refactored so that substrings of the pattern and text are encoded in both SLPs in the same way.
Efficiency of Fast Parallel Pattern Searching in Highly Compressed Texts
• Computer Science
MFCS
• 1999
It is shown how to improve a naive straightforward NC algorithm and obtain almost optimal parallel RLZ-compressed matching applying tree-contraction techniques to directed acyclic graphs with polynomial tree-size.
Tying up the loose ends in fully LZW-compressed pattern matching
An optimal linear time solution is developed for the case when p and t are compressed using the LZW method, which improves the previously known O((n+m)log(n+m)) time solution of Gasieniec and Rytter, and essentially closes the line of research devoted to studying LZW-compressed exact pattern matching.
A Fully Compressed Algorithm for Computing the Edit Distance of Run-Length Encoded Strings
• Computer Science
Algorithmica
• 2011
This paper presents the first “fully compressed” algorithm whose running time depends solely on the compressed string lengths, and yields the first fully compressed solution to approximate matching of a pattern of m runs in a text of n runs in O(mn²) time.
String Indexing with Compressed Patterns
• Computer Science
STACS
• 2020
This paper considers the basic variant where the pattern is given in compressed form and the goal is to achieve query time that is fast in terms of the compressed size of the pattern, and develops several data structural techniques of independent interest.
Analyzing the performance differences between pattern matching and compressed pattern matching on texts
• Computer Science
2013 International Conference on Electronics, Computer and Computation (ICECCO)
• 2013
The aim of the developed compression algorithm is to point out the difference in text processing between compressed and uncompressed text and to form opinions for other applications.
Fast Pattern Matching in Compressed Text using Wavelet Tree
• Computer Science
• 2018
This paper presents an efficient algorithm (WBTC_WT) for matching a pattern directly inside the compressed text and finds that this algorithm outperforms the existing algorithms in most of the cases.

References

Constant-Time Randomized Parallel String Matching
• Computer Science
SIAM J. Comput.
• 1997
A constant-expected-time Las Vegas algorithm is given for computing the period of the pattern and all witnesses, and thus for string matching itself; an $\Omega(\log\log m)$ lower bound is known for deterministic algorithms.
An Improved Pattern Matching Algorithm for Strings in Terms of Straight-Line Programs
• Computer Science
CPM
• 1997
An O(n²m²) time algorithm using O(nm) space is developed that outputs a compact representation of all occurrences of P in T and is superior to the algorithm proposed by Karpinski et al.
Let sleeping files lie: pattern matching in Z-compressed files
• Computer Science
SODA '94
• 1994
This paper considers pattern matching without decompression in the UNIX Z-compression, a variant of the Lempel Ziv adaptive compression scheme, and shows how to modify the algorithms to achieve a trade-off between the amount of extra space used and the algorithm's time complexity.
Efficient Algorithms for Lempel-Ziv Encoding
• Computer Science
• 1996
We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes then the problems can be solved deterministically in polynomial time in the case when
Randomized Efficient Algorithms for Compressed Strings: The Finger-Print Approach (Extended Abstract)
• Computer Science
CPM
• 1996
The equality testing is reduced to the equivalence of certain context-free grammars generating single strings and the time complexity of several classical problems for texts is related to the complexity Eq(n) of equality-testing.
Pattern-Matching for Strings with Short Descriptions
• Computer Science, Mathematics
CPM
• 1995
A textual problem for exponentially long strings is reduced here to simple arithmetic on integers with (only) linearly many bits, which makes it possible to represent some sets of exponentially many positions in terms of feasibly many arithmetic progressions.
Efficient algorithms for Lempel-Ziv encoding
• Computer Science
• 1996
It is shown that if the input texts are given by their Lempel-Ziv codes then the problems can be solved deterministically in polynomial time in the case when the original (uncompressed) texts are of exponential size.
Fast Pattern Matching in Strings
• Computer Science
SIAM J. Comput.
• 1977
An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings; a theoretical application shows that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.
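The linear-time uncompressed matching described in the last entry can be sketched as follows. This is the standard textbook formulation of the failure-function approach, not code from the cited paper, and the names `failure_function` and `kmp_search` are hypothetical.

```python
def failure_function(pattern):
    """fail[i] = length of the longest proper border of pattern[:i + 1]."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(pattern, text):
    """Return the start positions of all occurrences of `pattern` in `text`."""
    if not pattern:
        # Convention assumed here: the empty pattern occurs at every position.
        return list(range(len(text) + 1))
    fail = failure_function(pattern)
    hits = []
    k = 0  # number of pattern symbols currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # continue, allowing overlapping occurrences
    return hits
```

The text index `i` never moves backwards, and each fallback step undoes an earlier increment of `k`, so the total work is proportional to the sum of the two string lengths.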