Programming Techniques: Regular expression search algorithm

  • K. Thompson
  • Published 1 June 1968
  • Computer Science
  • Commun. ACM
A method for locating specific character strings embedded in character text is described and an implementation of this method in the form of a compiler is discussed. The object program then accepts the text to be searched as input and produces a signal every time an embedded string in the text matches the given regular expression. Examples, problems, and solutions are also presented.
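
The behaviour described above (scan the text once and signal whenever an embedded string matches) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's compiler or its object code: the automaton for `a(b|c)*d` is hand-built, and the names `NFA`, `START`, `ACCEPT`, and `search` are hypothetical. It shows one common way to realize such a search, namely tracking every automaton state that could be active after each character.

```python
# Hypothetical hand-built automaton for the regular expression a(b|c)*d.
# Each state maps to a list of (character, next_state) transitions.
NFA = {
    0: [("a", 1)],
    1: [("b", 1), ("c", 1), ("d", 2)],
    2: [],
}
START, ACCEPT = 0, 2


def search(text):
    """Return the end index of every substring of text matching a(b|c)*d."""
    hits = []
    current = set()
    for i, ch in enumerate(text):
        current.add(START)  # a match may begin at any position in the text
        following = set()
        for state in current:
            for label, target in NFA[state]:
                if label == ch:
                    following.add(target)
        if ACCEPT in following:
            hits.append(i)  # "signal": a match ends at index i
        current = following
    return hits


print(search("xabcdyad"))  # [4, 7]
```

Each text character is examined once, and the work per character is bounded by the size of the automaton, so no backtracking over the text is needed.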

Fast Regular Expression Search

A new algorithm to search regular expressions is presented, which is able to skip text characters, and is fast, the fastest one in many cases of interest.

A regular expression pattern matching processor for APL

This paper discusses classical regular expressions and their extension into the domain of APL in terms of locator templates, which describe patterns to be searched for, and action templates, which specify an action to be performed when a match is encountered.

Fast text searching for regular expressions or automaton searching on tries

This work obtains searching algorithms that run in logarithmic expected time in the size of the text for a wide subclass of regular expressions, and in sublinear expected time for any regular expression.

A compact function for regular expression pattern matching

This paper describes a simple compiler and interpreter for a finite state machine recognizer of patterns represented by regular expressions, designed to be compact and to require little work space.

A fast regular expression indexing engine

The design, architecture, and lessons learned from the implementation of a fast regular-expression indexing engine FREE show orders of magnitude performance improvement in certain cases over standard regular expression matching systems, such as lex, awk and grep.

Pattern Matching in Strings

Most formal systems handling strings can be considered as defining patterns in strings, notably formal grammars and especially regular expressions, which provide a technique to specify simple patterns.

Efficient string matching

A simple, efficient algorithm locates all occurrences of any of a finite number of keywords in a string of text; it has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
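
Multi-keyword search of this kind is commonly done with a keyword trie extended by failure links, so that the text is scanned once regardless of the number of keywords. The sketch below illustrates that general technique; it is an assumption that it resembles the summarized paper's construction, and the names `build` and `find_all` are hypothetical.

```python
from collections import deque


def build(keywords):
    """Build a keyword trie with failure links and per-state outputs."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> keywords that end at this state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass: a child's failure link is derived from its parent's.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, keywords):
    """Return (start_index, keyword) for every occurrence, in scan order."""
    goto, fail, out = build(keywords)
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


print(find_all("ushers", ["he", "she", "his", "hers"]))
# [(1, 'she'), (2, 'he'), (2, 'hers')]
```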

Fast and compact regular expression matching

Efficient tree construction for formal language query processing

The proposed algorithms are a preprocessing step for search algorithms which bypass the construction of a separate automaton for a given query.

Regular Expression Search on Compressed Text

An algorithm for searching regular expression matches in compressed text that requires up to 25% less time than the state of the art and defines efficient data structures that yield nearly optimal complexity bounds.



Derivatives of Regular Expressions

In this paper the notion of a derivative of a regular expression is introduced and the properties of derivatives are discussed and this leads, in a very natural way, to the construction of a state diagram from a regular expression containing any number of logical operators.
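
To make the notion of a derivative concrete: the derivative of an expression with respect to a character describes what may still follow once that character has been read. The sketch below is a minimal illustration under stated assumptions. The tuple encoding and the names `char`, `alt`, `cat`, `star`, `nullable`, `deriv`, and `matches` are hypothetical; only union, concatenation, and star are covered; and it matches strings directly rather than building a state diagram.

```python
# Regular expressions as nested tuples (a hypothetical encoding).
EMPTY = ("empty",)  # matches no string at all
EPS = ("eps",)      # matches only the empty string


def char(c):
    return ("char", c)


def alt(r, s):
    return ("alt", r, s)


def cat(r, s):
    return ("cat", r, s)


def star(r):
    return ("star", r)


def nullable(r):
    """True if r matches the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return False  # "empty" and "char"


def deriv(r, c):
    """Derivative of r with respect to the character c."""
    tag = r[0]
    if tag == "char":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return alt(deriv(r[1], c), deriv(r[2], c))
    if tag == "cat":
        left = cat(deriv(r[1], c), r[2])
        if nullable(r[1]):
            return alt(left, deriv(r[2], c))
        return left
    if tag == "star":
        return cat(deriv(r[1], c), r)
    return EMPTY  # "empty" and "eps" have no derivative by any character


def matches(r, text):
    """Take one derivative per character, then test for the empty string."""
    for c in text:
        r = deriv(r, c)
    return nullable(r)


PATTERN = cat(char("a"), cat(star(alt(char("b"), char("c"))), char("d")))
print(matches(PATTERN, "abcd"), matches(PATTERN, "abc"))  # True False
```

No simplification is applied here, so the derived expressions grow with the input; identifying equivalent derivatives is what would keep the number of distinct states finite.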

Representation of Events in Nerve Nets and Finite Automata

This memorandum is devoted to an elementary exposition of the problems and of results obtained on the McCulloch-Pitts nerve net during investigations in August 1951.

IBM 7094 principles of operation. File No. 7094-01, Form A22-6703-1
