Programming Techniques: Regular expression search algorithm

Ken Thompson, Commun. ACM
A method for locating specific character strings embedded in character text is described, and an implementation of this method in the form of a compiler is discussed. [...] The object program then accepts the text to be searched as input and produces a signal every time an embedded string in the text matches the given regular expression. Examples, problems, and solutions are also presented.
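The abstract's compile-then-search idea can be illustrated with a minimal sketch. It is not the paper's implementation: instead of generating machine code, it builds a small nondeterministic automaton in memory and advances a set of active states over the text. The names `NFA`, `compile_regex`, `closure`, and `search` are hypothetical, and the assumed syntax covers only literal characters, `|`, `*`, and parentheses.

```python
class NFA:
    """State table: label[s] is a literal character or None (epsilon state)."""
    def __init__(self):
        self.label = []
        self.edges = []   # edges[s]: successor states

    def new_state(self, label=None):
        self.label.append(label)
        self.edges.append([])
        return len(self.label) - 1


def compile_regex(pattern):
    """Recursive-descent compiler; returns (nfa, start_state, accept_state)."""
    nfa, pos = NFA(), 0

    def parse_alt():
        nonlocal pos
        frag = parse_concat()
        while pos < len(pattern) and pattern[pos] == '|':
            pos += 1
            right = parse_concat()
            s, e = nfa.new_state(), nfa.new_state()
            nfa.edges[s] = [frag[0], right[0]]   # try either branch
            nfa.edges[frag[1]].append(e)
            nfa.edges[right[1]].append(e)
            frag = (s, e)
        return frag

    def parse_concat():
        s = nfa.new_state()                      # empty fragment to extend
        frag = (s, s)
        while pos < len(pattern) and pattern[pos] not in '|)':
            nxt = parse_star()
            nfa.edges[frag[1]].append(nxt[0])    # chain the pieces together
            frag = (frag[0], nxt[1])
        return frag

    def parse_star():
        nonlocal pos
        frag = parse_atom()
        while pos < len(pattern) and pattern[pos] == '*':
            pos += 1
            s, e = nfa.new_state(), nfa.new_state()
            nfa.edges[s] = [frag[0], e]          # enter the loop or skip it
            nfa.edges[frag[1]] += [frag[0], e]   # repeat the loop or leave it
            frag = (s, e)
        return frag

    def parse_atom():
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        if ch == '(':
            frag = parse_alt()
            if pos >= len(pattern) or pattern[pos] != ')':
                raise ValueError("missing ')'")
            pos += 1
            return frag
        s, e = nfa.new_state(ch), nfa.new_state()
        nfa.edges[s] = [e]                       # taken after consuming ch
        return (s, e)

    start, accept = parse_alt()
    if pos != len(pattern):
        raise ValueError("unbalanced ')'")
    return nfa, start, accept


def closure(nfa, states):
    """Add every state reachable through epsilon states."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        if nfa.label[s] is None:
            for t in nfa.edges[s]:
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
    return seen


def search(pattern, text):
    """Return each end offset at which some non-empty substring matches."""
    nfa, start, accept = compile_regex(pattern)
    ends, current = [], set()
    for i, ch in enumerate(text):
        # Re-adding the start state lets a match begin at any position.
        current = closure(nfa, current | {start})
        stepped = set()
        for s in current:
            if nfa.label[s] == ch:
                stepped.update(nfa.edges[s])
        current = closure(nfa, stepped)
        if accept in current:
            ends.append(i + 1)                   # the "signal" for this match
    return ends
```

For example, `search("a(b|c)*d", "xabcdxad")` returns `[5, 8]`, the offsets just past each embedded match. Because all active states advance together, each text character is examined once, with no backtracking; in this sketch, empty matches are deliberately not reported.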
Fast Regular Expression Search
A new algorithm for regular expression search is presented. It is able to skip text characters and is the fastest one in many cases of interest.
A regular expression pattern matching processor for APL
This paper discusses classical regular expressions and their extension into the domain of APL in terms of locator templates, which describe patterns to be searched for, and action templates, which specify an action to be performed when a match is encountered.
Fast text searching for regular expressions or automaton searching on tries
This work obtains searching algorithms that run in logarithmic expected time in the size of the text for a wide subclass of regular expressions, and in sublinear expected time for any regular expression.
A compact function for regular expression pattern matching
This paper describes a simple compiler and interpreter for a finite state machine recognizer of patterns represented by regular expressions, designed to be compact and to require little work space.
A fast regular expression indexing engine
The design, architecture, and lessons learned from the implementation of a fast regular-expression indexing engine FREE show orders of magnitude performance improvement in certain cases over standard regular expression matching systems, such as lex, awk and grep.
Pattern matching in strings
Most formal systems handling strings can be considered as defining patterns in strings. This holds especially for formal grammars, and for regular expressions in particular, which provide a technique to specify simple patterns.
Efficient string matching
A simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text is described. It has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
Fast and compact regular expression matching
This work shows how to improve the space and/or remove a dependency on the alphabet size for each problem using either an improved tabulation technique of an existing algorithm or by combining known algorithms in a new way.
Efficient tree construction for formal language query processing
The proposed algorithms are a preprocessing step for search algorithms which bypass the construction of a separate automaton for a given query.
Regular Expression Search on Compressed Text
An algorithm for searching regular expression matches in compressed text that requires up to 25% less time than the state of the art and defines efficient data structures that yield nearly optimal complexity bounds.


Derivatives of Regular Expressions
In this paper the notion of a derivative of a regular expression is introduced and the properties of derivatives are discussed. This leads, in a very natural way, to the construction of a state diagram from a regular expression containing any number of logical operators.
Representation of Events in Nerve Nets and Finite Automata
This memorandum is devoted to an elementary exposition of the problems and of results obtained on the McCulloch-Pitts nerve net during investigations in August 1951.
IBM 7094 principles of operation. File No. 7094-01, Form A22-6703-1