Automatic generation of efficient lexical processors using finite state techniques

Authors: Walter L. Johnson, James H. Porter, Stephanie I. Ackley, Douglas T. Ross
Journal: Commun. ACM
This tutorial article addresses the practical application of finite-state automata theory to the automatic generation of lexical processors, using the AED RWORD system developed at M.I.T. as part of the AED-1 system. The system accepts as input descriptions of the multicharacter items or words allowable in a language, given in terms of a subset of regular expressions. Its output is a lexical processor which reads a string of characters and combines them into…
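The kind of lexical processor the abstract describes can be illustrated with a minimal table-driven sketch. It is not the RWORD system: the transition table below is written by hand rather than generated from regular expressions, and the token kinds (`IDENT`, `INT`), state names, and the helper `char_class` are assumptions made for illustration only.

```python
# Hypothetical hand-built DFA. A generator such as RWORD would derive a
# table like this from regular-expression descriptions of the tokens.

def char_class(ch):
    """Map a character to the (assumed) input class used by the table."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# TRANSITIONS[state][character class] -> next state
TRANSITIONS = {
    "start": {"letter": "ident", "digit": "int"},
    "ident": {"letter": "ident", "digit": "ident"},
    "int": {"digit": "int"},
}

# Accepting states and the token kind each one recognizes.
ACCEPTING = {"ident": "IDENT", "int": "INT"}

def tokenize(text):
    """Split text into (kind, lexeme) pairs using longest-match scanning."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace separates tokens and is skipped
            continue
        state = "start"
        last_accept = None  # (end position, kind) of the longest match so far
        i = pos
        while i < len(text):
            nxt = TRANSITIONS.get(state, {}).get(char_class(text[i]))
            if nxt is None:
                break  # no transition: stop and fall back to last accept
            state = nxt
            i += 1
            if state in ACCEPTING:
                last_accept = (i, ACCEPTING[state])
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        end, kind = last_accept
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens
```

For example, `tokenize("x1 42")` yields an `IDENT` item for `x1` followed by an `INT` item for `42`. The scanner reads characters one at a time and groups them into items, and the table lookup is the only per-character work.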


Design of a microprogrammed lexical microprocessor

  • Y. Chu
  • Computer Science
    MICRO 8
  • 1975
The design of a lexical processor is presented, which is vertically microprogrammed for easier programming, and could be implemented as a microprocessor to be a member of a multi-microprocessor system for high-level languages.

A Language Independent Scanner Generator

A methodology for scanner generation that supports automatic generation of off-the-shelf scanners from specifications. It is convenient because the user works with meaningful constructs and no programming is required.

On the look-ahead problem in lexical analysis

  • Wuu Yang
  • Computer Science
    Acta Informatica
  • 2005
A new lexical analyzer uses suffix finite automata to identify tokens, and it can detect lexical errors earlier than traditional lexical analyzers.

Stream Processing using Grammars and Regular Expressions

This dissertation presents Kleenex, a language for expressing high-performance streaming string processing programs as regular grammars with embedded semantic actions, and its compilation to streaming string transducers with worst-case linear-time performance.

Construction of a Minimal Deterministic Finite Automaton from a Regular Expression

The main advantage of the minimal DFA construction algorithm is its minimal intermediate memory requirements and, hence, its reduced time complexity.

Myths and Facts about the Efficient Implementation of Finite Automata and Lexical Analysis

Analysis of the algorithms as well as run-time statistics on cache misses and instruction frequency reveals substantive differences in code locality and certain kinds of overhead typical for specific implementation strategies.

Parsing with Neural and Finite Automata Networks: A Graph Grammar Approach

A twofold investigation of graph grammars that attempts to use both of their aspects (generating a valid language and parsing a language for validity) for parsing with (i) neural networks and (ii) finite automata networks.

Control Flow Aspects of Semantics-Directed Compiling

  • R. Sethi
  • Computer Science, Linguistics
  • 1983
This paper is a demonstration of a semantics-directed compiler generator. We focus on the part of a compiler between syntax analysis and code generation. A language is specified by adding semantic

Efficient string matching

A simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
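Multi-keyword matching of this kind is commonly implemented with what is now known as the Aho-Corasick construction: a keyword trie extended with failure links, so the text is scanned once regardless of how many keywords there are. The sketch below is an illustrative version under that assumption, not code from the paper. The names `build_automaton` and `find_all` are hypothetical, and keywords are assumed to be non-empty.

```python
from collections import deque

def build_automaton(keywords):
    """Build goto, failure, and output tables for a set of non-empty keywords."""
    goto = [{}]    # goto[state][char] -> next state (state 0 is the root)
    fail = [0]     # fail[state] -> state to fall back to on a mismatch
    output = [[]]  # output[state] -> keywords that end at this state

    # Phase 1: insert every keyword into a trie.
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(word)

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure state.
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output

def find_all(text, keywords):
    """Return (start index, keyword) for every occurrence, in scan order."""
    goto, fail, output = build_automaton(keywords)
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in output[state]:
            matches.append((i - len(word) + 1, word))
    return matches
```

Calling `find_all("ushers", ["he", "she", "his", "hers"])` reports overlapping occurrences of `she`, `he`, and `hers` in a single left-to-right pass over the text.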



On Formalisms for Turing Machines

Turing's original quintuple formalism for an abstract computing machine is compared with the quadruple approach of Post and with some new alternatives, and some new alternative definitions are introduced.

Automatic-programming-language translation through syntactical analysis

The methods and techniques described in the present discussion represent the interpretation and partial development of a concept originally due to E. T. Dickinson and are presented as a tutorial exposition of syntax-directed automatic-programming-language translation with samples from aspects of ALGOL.

A generalized technique for symbol manipulation and numerical calculation

An unusual use of index registers is described which provides a computer technique that appears to include all known symbol manipulation techniques as simple subcases and is ideally suited to both symbolic and numerical operations.

A new hierarchy of elementary functions

This paper contains two main results: a new and neat characterization of Ritchie's classes, and the observation that a measure of the computational complexity of a function f can be taken to be the least i such that f ∈ H_i.

Regular Expressions and State Graphs for Automata

Algorithms are presented for 1) converting a state graph describing the behavior of an automaton to a regular expression describing the behavior of the same automaton (section 2), and 2) for

Design of a separable transition-diagram compiler

A COBOL compiler design is presented which is compact enough to permit rapid, one-pass compilation of a large subset of COBOL on a moderately large computer. Versions of the same compiler for smaller

The AED approach to generalized computer-aided design

This paper has been written in response to a request for an up-to-date broad view of the approach to computer-aided design taken by the M.I.T. Computer-Aided Design Project. Included in the

Translator writing systems

A critical review of recent efforts to automate the writing of translators of programming languages is presented and various approaches to automating the postsyntactic aspects of translator writing are discussed.