A Framework for Dynamic Parameterized Dictionary Matching

@inproceedings{Ganguly2016AFF,
  title={A Framework for Dynamic Parameterized Dictionary Matching},
  author={Arnab Ganguly and Wing-Kai Hon and Rahul Shah},
  booktitle={SWAT},
  year={2016}
}
Two equal-length strings S and S' are a parameterized-match (p-match) iff there exists a one-to-one function that renames the characters in S to those in S'. Let P be a collection of d patterns of total length n characters that are chosen from an alphabet Sigma of cardinality sigma. The task is to index P such that we can support the following operations. * search(T): given a text T, report all occurrences such that there exists a pattern P_i in P that is a p-match with the substring T[j,j… 
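The p-match definition above can be made concrete with a small sketch. This is only an illustration of the definition and of what search(T) reports, not the index from the paper: it is a naive scan, the names `is_p_match` and `naive_search` are hypothetical, and reporting 0-based (pattern index, start position) pairs is an assumption.

```python
from typing import Dict, List, Tuple


def is_p_match(s: str, t: str) -> bool:
    """Return True iff a one-to-one renaming maps s onto t, character by character."""
    if len(s) != len(t):
        return False
    forward: Dict[str, str] = {}   # character of s -> character of t
    backward: Dict[str, str] = {}  # character of t -> character of s
    for a, b in zip(s, t):
        # The renaming must be a function (forward) and injective (backward).
        if forward.setdefault(a, b) != b:
            return False
        if backward.setdefault(b, a) != a:
            return False
    return True


def naive_search(patterns: List[str], text: str) -> List[Tuple[int, int]]:
    """Report (i, j) whenever patterns[i] p-matches text[j : j + len(patterns[i])].

    Assumption: positions are 0-based. This brute-force scan only illustrates
    the output of the query; it does not reflect the paper's data structure.
    """
    hits: List[Tuple[int, int]] = []
    for i, p in enumerate(patterns):
        for j in range(len(text) - len(p) + 1):
            if is_p_match(p, text[j:j + len(p)]):
                hits.append((i, j))
    return hits
```

For instance, "abab" and "xyxy" are a p-match under the renaming a→x, b→y, whereas "ab" and "xx" are not, because no one-to-one renaming sends both a and b to x.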
A framework for designing space-efficient dictionaries for parameterized and order-preserving matching
TLDR
This paper presents indexes of the same sizes, but with slightly increased query time for order-preserving matching and parameterized matching, and considers this problem under the following definitions of matching.
Parameterized Text Indexing with One Wildcard
TLDR
This paper studies a generalization of this problem in which the pattern contains one wildcard character ϕ ∉ Σ that matches any character in Σ, and shows that such queries can be answered in optimal O(p+occ) time per query using an O(n log n) space index.
Real-Time Streaming Multi-Pattern Search for Constant Alphabet
TLDR
It is proved that for a constant size alphabet, there exists a randomized Monte-Carlo algorithm for the streaming dictionary matching problem that takes constant time per character and uses O(d log m) words of space, where m is the length of the longest pattern in the dictionary.
Parameterized Dictionary Matching with One Gap
TLDR
This paper presents two algorithms solving the Parameterized Dictionary Matching with One Gap problem, which stems from cyber security: the patterns are the malware sequences the authors want to detect in the text, and the necessity of a parameterized match is due to their encryption.
Online Parameterized Dictionary Matching with One Gap
TLDR
This work defines and studies the strict PDMOG problem, in which sub-patterns of the same dictionary pattern must be parameterized-matched via the same bijection, which captures situations where sub-patterns of a dictionary pattern are encoded simultaneously.
A brief history of parameterized matching problems
Abstract Parameterized pattern matching is a string searching variant that was initially defined to detect duplicate code but later proved to support several other applications. In particular, two
Parameterized dictionary matching and recognition with one gap
TLDR
The paper presents two algorithms solving the Parameterized Dictionary Matching with One Gap problem for dictionaries with non-uniformly bounded gaps, and suggests the related problem of Parameterized Dictionary Recognition with One Gap, which requires reporting a single parameterized appearance of each gapped pattern.
Dynamic Dictionary Matching in the Online Model
TLDR
In the online version of the dictionary matching problem, the characters of T arrive online, one at a time, and the goal is to establish, immediately after every new character arrival, which of the patterns in \(\mathcal{D}\) are a suffix of the current text.
pBWT: Achieving Succinct Data Structures for Parameterized Pattern Matching and Related Problems
TLDR
A new BWT-like transform, called pBWT, is introduced, giving an n log σ + O(n)-bit index with O(|P| log σ + occ · log n log σ) query time; the transform is then extended to obtain a succinct index for the Parameterized Dictionary Matching problem of Idury and Schaffer.
Towards Optimal Approximate Streaming Pattern Matching by Matching Multiple Patterns in Multiple Streams
TLDR
A new algorithm is presented for the KMM problem in the streaming model that, up to poly-log factors, has the same bounds as the most recent results that use different techniques and, for most inputs, is significantly faster on average.

References

Succinct Dictionary Matching with No Slowdown
TLDR
The Aho-Corasick automaton can be represented in just m(log σ + O(1)) + O(d log(n/d)) bits of space while still maintaining the ability to answer queries in O(|T| + occ) time.
Succinct Index for Dynamic Dictionary Matching
In this paper we revisit the dynamic dictionary matching problem, which asks for an index for a set of patterns P_1, P_2, ..., P_k that can support the following query and update operations
Dynamic Dictionary Matching
TLDR
An algorithm is presented that performs any sequence of the following operations in the given time bounds, including inserting a new pattern into the dictionary and deleting a pattern from it.
Compressed Index for Dictionary Matching
TLDR
This paper shows how to exploit a sampling technique to compress the existing O(n)-word index to an (n H_k(D) + o(n log σ))-bit index with only a small sacrifice in search time.
Improved dynamic dictionary matching
TLDR
A faster algorithm for dynamic string dictionary matching with bounded alphabets, and a novel method to efficiently manipulate failure links for two-dimensional patterns.
Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract)
TLDR
An index structure is constructed that occupies only O(n) bits, compares favorably with inverted lists in space, and achieves optimal O(m/log n) search time for sufficiently large m = Ω(log^{3+ε} n).
Dynamic Entropy-Compressed Sequences and Full-Text Indexes
TLDR
Given a sequence of n bits with binary zero-order entropy H0, this result becomes the first entropy-bound dynamic data structure for rank and select over bit sequences, and is used to build a dynamic full-text self-index for a collection of texts over an alphabet of size σ.
An Improved Query Time for Succinct Dynamic Dictionary Matching
In this work, we focus on building an efficient succinct dynamic dictionary that significantly improves the query time of the current best known results. The algorithm that we propose suffers from
Dynamic dictionary matching and compressed suffix trees
TLDR
This paper presents the first O(n)-bit representation of a suffix tree for a dynamic collection of texts whose total length is n, which supports insertion and deletion of a text in O(n) time, as well as all suffix tree traversal operations, including forward and backward suffix links.
Fully Functional Static and Dynamic Succinct Trees
TLDR
A simple and flexible data structure is proposed, called the range min-max tree, that reduces the large number of relevant tree operations considered in the literature to a few primitives that are carried out in constant time on polylog-sized trees.