Fast Pattern Matching in Strings

  • Donald Ervin Knuth, James H. Morris, Vaughan R. Pratt
  • SIAM J. Comput.
An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings. The constant of proportionality is low enough to make this algorithm of practical use, and the procedure can also be extended to deal with some more general pattern-matching problems. A theoretical application of the algorithm shows that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be… 
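
The linear-time search described in the abstract can be sketched as follows. This is a minimal illustration of the standard failure-table formulation of Knuth-Morris-Pratt matching, not a transcription of the paper's own presentation; the names `failure_table` and `find_all` are chosen here for illustration.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_all(pattern, text):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return list(range(len(text) + 1))
    fail = failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going so overlapping matches are found
    return matches


print(find_all("aba", "ababa"))  # [0, 2]
```

The text index `i` never moves backward, and each fallback step undoes an earlier increment of `k`, which is why the total work is proportional to the sum of the two string lengths.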

String-Matching on Ordered Alphabets

On the Worst-Case Behavior of String-Searching Algorithms

  • R. Rivest
  • Computer Science, Mathematics
    SIAM J. Comput.
  • 1977
There do not exist pattern matching algorithms whose worst-case behavior is “sublinear” in n (that is, linear with constant less than one), in contrast with the situation for average behavior (the Boyer-Moore algorithm is known to be sublinear on the average).

Efficient Comparison Based String Matching

This work gives a linear-time algorithm that finds all occurrences of a pattern of length m in a text of length n in [formula] comparisons and establishes that, in general, searching for a long pattern is easier than searching for a short one.

A Method for Improving String Pattern Matching Machines

This correspondence describes an efficient string pattern matching machine to locate all occurrences of any of a finite number of keywords and phrases in an arbitrary text string. Some conditions are

Efficient String Matching with Don’t-Care Patterns

This paper considers the extension of the methods of Aho and Corasick to deal with patterns involving more expressive descriptions, such as don’t-care (wild-card) symbols, complements, etc.

Efficient Randomized Pattern-Matching Algorithms

We present randomized algorithms to solve the following string-matching problem and some of its generalizations: Given a string X of length n (the pattern) and a string Y (the text), find the first
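
The excerpt above is cut off before it names a method, so the following is only a generic sketch of one well-known randomized approach: comparing rolling-hash fingerprints and verifying any fingerprint hit directly. The function name `find_first`, the default modulus, and the random-base choice are assumptions made for this illustration, not details taken from the paper.

```python
import random


def find_first(pattern, text, modulus=(1 << 61) - 1):
    """Return the index of the first occurrence of pattern in text, or -1.

    Fingerprints are polynomial hashes modulo a fixed prime with a random
    base; every fingerprint hit is checked character by character, so the
    answer is always correct and only the running time is randomized.
    """
    n, m = len(pattern), len(text)  # n = pattern length, as in the abstract
    if n > m:
        return -1
    base = random.randrange(256, modulus)
    top = pow(base, n - 1, modulus) if n else 0  # weight of the leading char
    hp = ht = 0
    for i in range(n):
        hp = (hp * base + ord(pattern[i])) % modulus
        ht = (ht * base + ord(text[i])) % modulus
    for start in range(m - n + 1):
        if hp == ht and text[start:start + n] == pattern:
            return start
        if start + n < m:
            # Slide the window: drop text[start], append text[start + n].
            ht = ((ht - ord(text[start]) * top) * base
                  + ord(text[start + n])) % modulus
    return -1


print(find_first("needle", "haystack with needle inside"))  # 14
```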

Average-optimal string matching

Discovering Repetitions in Strings

The basic problem is to determine whether a pattern string x appears as a (contiguous) substring of a text y, i.e., whether y = uxv for some strings u, v.

Fast Packed String Matching for Short Patterns

Specialized word-size packed string matching instructions, based on the Intel streaming SIMD extensions (SSE) technology, are used to design very fast string matching algorithms in the case of short patterns.




By exploiting the formal similarity of string-matching with integer multiplication, a new algorithm has been obtained with a running time which is only slightly worse than linear.
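
To illustrate the formal similarity mentioned above, here is a hedged sketch (not the paper's algorithm, whose details are not given in this excerpt) that counts character agreements at every alignment by packing indicator vectors into large integers and multiplying them. The names `match_counts` and `find_all_by_multiplication` are hypothetical, and the sketch makes no claim about running time.

```python
def match_counts(pattern, text):
    """counts[s] = number of positions j with text[s + j] == pattern[j]."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    bits = m.bit_length()  # 2**bits > m, so no digit of the product overflows
    total = 0
    for c in set(pattern):
        # Text indicator for c, one base-2**bits digit per position.
        t_num = sum(1 << (bits * i) for i, ch in enumerate(text) if ch == c)
        # Pattern indicator for c, reversed so the product is a correlation.
        p_num = sum(1 << (bits * (m - 1 - j))
                    for j, ch in enumerate(pattern) if ch == c)
        total += t_num * p_num
    mask = (1 << bits) - 1
    # Digit s + m - 1 of the summed products holds the count for alignment s.
    return [(total >> (bits * (s + m - 1))) & mask for s in range(n - m + 1)]


def find_all_by_multiplication(pattern, text):
    """Alignments where every pattern character agrees with the text."""
    m = len(pattern)
    return [s for s, c in enumerate(match_counts(pattern, text)) if c == m]


print(match_counts("ab", "abab"))                   # [2, 0, 2]
print(find_all_by_multiplication("aba", "ababa"))   # [0, 2]
```

Each digit of the product is a sum of products of indicator bits, which is exactly a convolution coefficient; that is the sense in which matching resembles multiplication.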

Rapid identification of repeated patterns in strings, trees and arrays

This paper describes a strategy for constructing efficient algorithms for solving two types of matching problems and develops explicit algorithms for these two problems applied to strings and arrays.

A New Linear-Time "On-Line" Algorithm for Finding the Smallest Initial Palindrome of a String

The present algorithm, based on the Knuth-Morris-Pratt algorithm, recognizes the initial leftmost nonvoid palindrome of a string in time proportional to the length N of the palindrome; an extension recognizes the initial odd or even palindromes of length 2 or greater.

Linear Pattern Matching Algorithms

A linear-time algorithm for obtaining a compacted version of a bi-tree associated with a given string is presented, and it is indicated how this construction solves several pattern-matching problems, including some from [4], in linear time.

Implementation of the substring test by hashing

Tradeoff curves are developed to show minimal cost of file usage by grouping various partially combined indices under conditions of file usage with different fractions of retrieval and update.

The Design and Analysis of Computer Algorithms

This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.

A fast string searching algorithm

The algorithm has the unusual property that, in most cases, not all of the first i characters of the string being searched are inspected, where i is the location of the first occurrence of the pattern.

On converting on-line algorithms into real-time and on real-time algorithms for string-matching and palindrome recognition

This work uses a sufficient condition for transforming an on-line algorithm into a real-time algorithm to construct real-time algorithms for string-matching and palindrome recognition problems on random access machines and on Turing machines.

Synchronization of binary messages

  • E. Gilbert
  • Computer Science
    IRE Trans. Inf. Theory
  • 1960
If blocks of N digits are used, the prefix should be chosen to make large the number G(N) of different blocks which satisfy the constraints, and strengthening the prefix decreases the number of "message digits" which remain in the block but also relaxes the constraints.

On the Translation of Languages from Left to Right

  • D. Knuth
  • Computer Science, Linguistics
    Inf. Control.
  • 1965