Compressed Subsequence Matching and Packed Tree Coloring

@article{Bille2015CompressedSM,
  title={Compressed Subsequence Matching and Packed Tree Coloring},
  author={Philip Bille and Patrick Hagge Cording and Inge Li G{\o}rtz},
  journal={Algorithmica},
  year={2015},
  volume={77},
  pages={336--348}
}
We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size n compressing a string of size N and a pattern string of size m over an alphabet of size $\sigma$, our algorithm uses $O(n+\frac{n\sigma}{w})$ space and $O(n+\frac{n\sigma}{w}+m\log N\log w\cdot occ)$ or $O(n+\frac{n\sigma}{w}\log w+m\log N\cdot occ)$ time. Here w is the word size and occ is the number of minimal occurrences…
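The notion of minimal occurrences in the abstract can be illustrated on a plain, uncompressed string. The following is a minimal sketch and not the paper's compressed algorithm: it assumes the text fits in memory, and the helper names (`build_positions`, `next_occurrence`, `prev_occurrence`, `minimal_occurrences`) are hypothetical. An occurrence here is a pair of indices (i, j) such that text[i..j] contains the pattern as a subsequence and no proper substring of text[i..j] does. Next-occurrence queries are answered by binary search over per-character position lists, standing in for the compressed data structure.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_positions(text):
    """Map each character to the sorted list of indices where it occurs."""
    positions = defaultdict(list)
    for i, ch in enumerate(text):
        positions[ch].append(i)
    return positions


def next_occurrence(positions, ch, start):
    """Smallest index >= start that holds ch, or None if there is none."""
    idx_list = positions.get(ch, [])
    k = bisect_left(idx_list, start)
    return idx_list[k] if k < len(idx_list) else None


def prev_occurrence(positions, ch, end):
    """Largest index <= end that holds ch, or None if there is none."""
    idx_list = positions.get(ch, [])
    k = bisect_right(idx_list, end)
    return idx_list[k - 1] if k > 0 else None


def minimal_occurrences(text, pattern):
    """Return all minimal occurrences of pattern in text as (begin, end) pairs."""
    if not pattern:
        return []
    positions = build_positions(text)
    result = []
    start = 0
    while True:
        # Forward pass: greedily match the pattern to find the earliest end.
        pos = start
        end = None
        for ch in pattern:
            j = next_occurrence(positions, ch, pos)
            if j is None:
                return result  # no further occurrence exists
            end = j
            pos = j + 1
        # Backward pass: greedily match the reversed pattern from that end
        # to find the latest begin, which makes the window minimal.
        pos = end
        begin = None
        for ch in reversed(pattern):
            j = prev_occurrence(positions, ch, pos)
            begin = j
            pos = j - 1
        result.append((begin, end))
        # Any other minimal occurrence must start strictly after this begin.
        start = begin + 1


if __name__ == "__main__":
    print(minimal_occurrences("abcabc", "ac"))
```

Each reported occurrence costs one forward and one backward pass of m queries, which mirrors the per-occurrence factor of m in the time bounds above; the logarithmic factors in those bounds come from answering the queries on the grammar rather than on position lists.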
Compressed Subsequence Matching and Packed Tree Coloring
TLDR
A new algorithm for subsequence matching in grammar compressed strings that uses a new data structure that allows us to efficiently find the next occurrence of a given character after a given position in a compressed string.
Rank, select and access in grammar-compressed strings
TLDR
This paper is the first to study the asymptotic complexity of rank and select in the grammar-compressed setting, and provides a hardness result showing that significantly improving the bounds the authors achieve would imply a major breakthrough on a hard graph-theoretical problem.
Balancing Straight-Line Programs for Strings and Trees
TLDR
The talk will explain a recent balancing result according to which a context-free grammar in Chomsky normal form of size m that produces a single string w of length n can be transformed in linear time into a Context-free Grammar-based compression formalism.
Compressed and Practical Data Structures for Strings
TLDR
This dissertation studies the random access problem with the finger search property, that is, the time for a random access query should depend on the distance between a specified index f, called the finger, and the query index i, and improves an O(log N) query bound to O(log D) for the static variant and to O(log D + log log N) for the dynamic variant.
Practical Grammar Compression Based on Maximal Repeats
TLDR
It is demonstrated that MR-RePair constructs more compact grammars than RePair does, especially for highly repetitive texts, because it considers the one-time substitution of the most frequent maximal repeats instead of the consecutive substitution of the most frequent pairs.

References

Faster subsequence recognition in compressed strings
TLDR
This work considers local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to Lempel–Ziv compression.
Compact Labeling Scheme for Ancestor Queries
TLDR
The main result in this paper is a labeling scheme with maximum label length $\log_2 n + O(\sqrt{\log n})$.
Combinatorial Pattern Matching
Faster Subsequence and Don't-Care Pattern Matching on Compressed Texts
TLDR
This work presents an O(nm) time algorithm for solving all variations of the subsequence pattern matching problem, and improves the previous best known algorithm of Tiskin (Towards approximate matching in compressed strings: Local subsequence recognition, Proc. CSR 2011), which runs in O(nm log m) time.
The smallest grammar problem
TLDR
This paper shows that every efficient algorithm for the smallest grammar problem has approximation ratio at least 8569/8568 unless P=NP, and bounds approximation ratios for several of the best known grammar-based compression algorithms, including LZ78, BISECTION, SEQUENTIAL, LONGEST MATCH, GREEDY, and RE-PAIR.
Window-accumulated subsequence matching problem is linear
TLDR
A non-conventional kind of RAM, the MP-RAM, is defined to model microprocessor operations more closely, and an O(n) on-line algorithm is designed for solving the subsequence matching problem on MP-RAMs.
Window Subsequence Problems for Compressed Texts
TLDR
This work searches for subsequences in a text compressed using Lempel-Ziv-like compression algorithms, without decompressing the text, and aims for algorithms that are almost optimal, in the sense that they run in time O(m), where m is the size of the compressed text.
Directed acyclic subsequence graph - Overview
Finding Level-Ancestors in Trees
A data structure for dynamic trees
TLDR
An O(mn log n)-time algorithm is obtained to find a maximum flow in a network of n vertices and m edges, beating by a factor of log n the fastest algorithm previously known for sparse graphs.