A Framework for Dynamic Parameterized Dictionary Matching

@inproceedings{Ganguly2016AFF,
  title={A Framework for Dynamic Parameterized Dictionary Matching},
  author={Arnab Ganguly and Wing-Kai Hon and Rahul Shah},
  booktitle={Scandinavian Workshop on Algorithm Theory},
  year={2016}
}
Two equal-length strings S and S' are a parameterized-match (p-match) iff there exists a one-to-one function that renames the characters in S to those in S'. Let P be a collection of d patterns of total length n characters that are chosen from an alphabet Σ of cardinality σ. The task is to index P such that we can support the following operations. * search(T): given a text T, report all occurrences such that there exists a pattern P_i in P that is a p-match with the substring T[j,j… 
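
The p-match definition in the abstract can be illustrated with a minimal sketch. It assumes the simplest reading of that definition (every character may be renamed, and the renaming must be one-to-one); the function name `is_p_match` is hypothetical and is not taken from the paper, and the check is a direct pairwise test rather than the indexing framework the paper describes.

```python
def is_p_match(s: str, t: str) -> bool:
    """Return True if some one-to-one renaming of characters turns s into t.

    Assumption: every character is renameable, as in the abstract's definition.
    """
    if len(s) != len(t):
        return False
    forward = {}   # character of s -> character of t
    backward = {}  # character of t -> character of s
    for a, b in zip(s, t):
        # setdefault stores the pairing on first sight and returns the
        # stored partner afterwards, so a mismatch means the renaming is
        # either not a function (forward) or not one-to-one (backward).
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


if __name__ == "__main__":
    print(is_p_match("abab", "cdcd"))  # renaming a->c, b->d
    print(is_p_match("ab", "aa"))      # a and b would both map to a
```

Keeping both a forward and a backward map is what enforces the one-to-one requirement; a forward map alone would accept "ab" against "aa".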

Parameterized Text Indexing with One Wildcard

Studies a generalization of parameterized text indexing in which the pattern contains one wildcard character ϕ ∉ Σ that matches any character in Σ, and shows that such queries can be answered in optimal O(p + occ) time per query using an O(n log n) space index.

Real-Time Streaming Multi-Pattern Search for Constant Alphabet

It is proved that for a constant size alphabet, there exists a randomized Monte-Carlo algorithm for the streaming dictionary matching problem that takes constant time per character and uses O(d log m) words of space, where m is the length of the longest pattern in the dictionary.

Parameterized Dictionary Matching with One Gap

This paper presents two algorithms for Parameterized Dictionary Matching with One Gap, a problem stemming from cyber security, where the patterns are the malware sequences to be detected in the text and the need for a parameterized match is due to their encryption.

Dynamic Dictionary Matching in the Online Model

In the online version of the dictionary matching problem, the characters of T arrive online, one at a time, and the goal is to establish, immediately after every new character arrival, which of the patterns in \(\mathcal {D}\) are a suffix of the current text.
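
The online problem statement above can be made concrete with a deliberately naive sketch. It is not the algorithm of the cited paper: it uses exact (non-parameterized) matching, rechecks every pattern on each arrival, and the class name `NaiveOnlineMatcher` and method `feed` are hypothetical names chosen for illustration.

```python
from collections import deque


class NaiveOnlineMatcher:
    """Report, after each arriving character, which patterns are a suffix
    of the text seen so far. Brute force, for illustration only."""

    def __init__(self, patterns):
        self.patterns = list(patterns)
        # Only the last max_len characters can matter for a suffix test.
        self.max_len = max((len(p) for p in self.patterns), default=0)
        self.window = deque(maxlen=self.max_len)

    def feed(self, ch):
        """Consume one character and return the patterns ending here."""
        self.window.append(ch)
        tail = "".join(self.window)
        return [p for p in self.patterns if p and tail.endswith(p)]


if __name__ == "__main__":
    matcher = NaiveOnlineMatcher(["ab", "b", "abc"])
    for ch in "abc":
        print(ch, matcher.feed(ch))
```

Each call to `feed` costs time proportional to the total pattern length, which is exactly the per-character cost that the online and streaming results listed here aim to reduce.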

pBWT: Achieving Succinct Data Structures for Parameterized Pattern Matching and Related Problems

Introduces a new BWT-like transform, called pBWT, which yields an n log σ + O(n)-bit index with O(|P| log σ + occ · log n log σ) query time and is extended to obtain a succinct index for the Parameterized Dictionary Matching problem of Idury and Schaffer.

Towards Optimal Approximate Streaming Pattern Matching by Matching Multiple Patterns in Multiple Streams

A new algorithm for the KMM problem in the streaming model that, up to poly-log factors, has the same bounds as the most recent results that use different techniques, and for most inputs, is significantly faster on average.

A Comparative Study of Dictionary Matching with Gaps: Limitations, Techniques and Challenges

A comparative survey of several formal problems within the broad scope of dictionary matching with gaps. It covers the formally proven limitations of any solution, the techniques developed to deal with those limitations and with the different problems, and the challenges still to be handled by future work.

Succinct Data Structures for Parameterized Pattern Matching and Related Problems

References

Succinct Dictionary Matching with No Slowdown

The Aho-Corasick automaton can be represented in just m(log σ + O(1)) + O(d log(n/d)) bits of space while still maintaining the ability to answer queries in O(|T| + occ) time.

Succinct Index for Dynamic Dictionary Matching

In this paper we revisit the dynamic dictionary matching problem, which asks for an index for a set of patterns P_1, P_2, ..., P_k that can support the following query and update operations

Compressed Index for Dictionary Matching

This paper shows how to exploit a sampling technique to compress the existing O(n)-word index to an (n H_k(D) + o(n log σ))-bit index with only a small sacrifice in search time.

Improved dynamic dictionary matching

A faster algorithm for dynamic string dictionary matching with bounded alphabets, and a novel method to efficiently manipulate failure links for two-dimensional patterns.

Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract)

An index structure is constructed that occupies only O(n) bits, compares favorably with inverted lists in space, and achieves optimal O(m/log n) search time for sufficiently large m = Ω(log^{3+ε} n).

An Improved Query Time for Succinct Dynamic Dictionary Matching

In this work, we focus on building an efficient succinct dynamic dictionary that significantly improves the query time of the current best known results. The algorithm that we propose suffers from

Dynamic dictionary matching and compressed suffix trees

This paper presents the first O(n)-bit representation of a suffix tree for a dynamic collection of texts whose total length is n, which supports insertion and deletion of a text in O(n) time, as well as all suffix tree traversal operations, including forward and backward suffix links.

Fully Functional Static and Dynamic Succinct Trees

A simple and flexible data structure is proposed, called the range min-max tree, that reduces the large number of relevant tree operations considered in the literature to a few primitives that are carried out in constant time on polylog-sized trees.

Compressed Text Databases with Efficient Query Algorithms Based on the Compressed Suffix Array

A compressed text database based on the compressed suffix array is proposed, and the relationship with the opportunistic data structure of Ferragina and Manzini is shown.