Space lower bounds for online pattern matching

@article{Clifford2013SpaceLB,
  title={Space lower bounds for online pattern matching},
  author={Rapha{\"e}l Clifford and Markus Jalsenius and Ely Porat and Benjamin Sach},
  journal={Theor. Comput. Sci.},
  year={2013},
  volume={483},
  pages={68--74}
}
Pattern Matching in Multiple Streams
We investigate the problem of deterministic pattern matching in multiple streams. In this model, one symbol arrives at a time and is associated with one of s streaming texts. The task at each time
Streaming Pattern Matching with d Wildcards
TLDR
Two new algorithms for the d wildcard pattern matching problem in the streaming model are introduced, one of which is a randomized Monte Carlo algorithm that uses $O(d + \log m)$ worst-case time per character and $O(d \log m)$ words of space.
The k-mismatch problem revisited
TLDR
The complexity of one of the most basic problems in pattern matching, the k-mismatch problem, is revisited, and a randomised online algorithm is given that runs in the same time complexity but requires only $O(k^2\,\text{polylog}\, m)$ space in total.
The streaming k-mismatch problem
TLDR
An $O(k\log^3 n\log\frac{n}{k})$-time streaming algorithm for the streaming k-mismatch problem whose space usage, measured in bits, is within logarithmic factors of optimal and improves on the previous record by approximately a factor of $k$.
Online Pattern Matching for String Edit Distance with Moves
TLDR
An online ESP (OESP) is presented that enables online pattern matching for edit distance with moves (EDM); OESP directly encodes the parse tree into a succinct representation by leveraging the idea behind recent results on dynamic succinct trees.
Streaming k-mismatch with data recovery and applications
TLDR
The k-mismatch problem is revisited, and the first streaming algorithms for pattern matching involving weighted strings are developed, where the strings can be either weighted or regular; the algorithms are randomised and correct with high probability.
Periodicity in Data Streams with Wildcards
TLDR
This work investigates the problem of detecting periodic trends within a string $S$, arriving in the streaming model, containing at most $k$ wildcard characters, and presents a two-pass streaming algorithm that computes wildcard-periods of $S$ using $\mathcal{O}(k^{3}\,\text{polylog}\,n)$ bits of space.
Periodicity in Data Streams with Wildcards
TLDR
A two-pass streaming algorithm that computes wildcard-periods of $S$ using $\mathcal{O}(k^3\,\mathsf{polylog}\,n)$ bits of space, while it is shown that this problem cannot be solved in sublinear space in one pass.
Streaming K-Mismatch with Error Correcting and Applications
TLDR
A new streaming algorithm for the k-Mismatch problem, one of the most basic problems in pattern matching, and a series of streaming algorithms for pattern matching on weighted strings, which are a commonly used representation of uncertain sequences in molecular biology.

References

Faster pattern matching with character classes using prime number encoding
Exact and Approximate Pattern Matching in the Streaming Model
  • Benny Porat, E. Porat
  • Mathematics, Computer Science
    2009 50th Annual IEEE Symposium on Foundations of Computer Science
  • 2009
We present a fully online randomized algorithm for the classical pattern matching problem that uses merely O(log m) space, breaking the O(m) barrier that held for this problem for a long time. Our
String Matching Under a General Matching Relation
TLDR
This work considers a general string matching problem in which an arbitrary many-to-many matching relation is specified and those positions in a text t are sought at which the pattern p matches under this relation.
Pattern matching with swaps
TLDR
This paper shows the first algorithm that solves the pattern matching with swaps problem in O(mn) time and presents an algorithm for a general alphabet $\Sigma$ whose time complexity depends on $\sigma = \min(m, |\Sigma|)$.
The One-Way Communication Complexity of Hamming Distance
TLDR
This note gives a simple proof of a linear lower bound for the randomized one-way communication complexity of the Hamming distance problem using a simple reduction from the indexing problem and avoids the VC-dimension arguments used in the previous paper.
Approximating edit distance efficiently
TLDR
Algorithms are developed that solve gap versions of the edit distance problem: given two strings of length $n$ with the promise that their edit distance is either at most $k$ or greater than $\ell$, decide which of the two holds. An $n^{3/7}$-approximation algorithm that runs in quasilinear time is also developed.
Maintaining Stream Statistics over Sliding Windows
TLDR
The problem of maintaining aggregates and statistics over data streams, with respect to the last $N$ data elements seen so far, is considered, and it is shown that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits of memory, the number of 1's can be estimated to within a factor of $1 + \epsilon$.
The communication complexity of the Hamming distance problem
Some complexity questions related to distributive computing (Preliminary Report)
  • A. Yao
  • Computer Science, Mathematics
    STOC
  • 1979
TLDR
The quantity of interest, which measures the information exchange necessary for computing f, is the minimum number of bits exchanged in any algorithm.
Unbiased bits from sources of weak randomness and probabilistic communication complexity
  • B. Chor, Oded Goldreich
  • Mathematics, Computer Science
    26th Annual Symposium on Foundations of Computer Science (sfcs 1985)
  • 1985
TLDR
It is shown that most Boolean functions have linear communication complexity in a very strong sense when used to extract almost unbiased and independent bits from the output of any two independent "probability-bounded" sources.