The Generation of Strings from a CFG using a Functional Language

Abstract

It is common to describe the input to many programs and systems in terms of a context-free grammar (CFG). To test programs that process strings generated from such grammars, it would be extremely useful to have effective methods of generating strings from the grammar itself. This paper explores a number of interesting issues that arise in generating such strings in the context of functional programming languages. A number of features commonly provided in functional languages, such as lazy evaluation and infinite data structures, along with the ease with which memoization of functions can be implemented, are surprisingly useful in the solution of this problem. The paper presents two distinct solutions. The first generates all possible strings in order of increasing length. The second generates strings from the grammar at random such that all strings of length n are produced with equal probability. In practice it is often necessary to manipulate a CFG by hand in order to convert it into some particular form. For example, to use recursive descent parsing techniques it is necessary to remove left recursion. A key requirement of the second, random generation approach is a set of counts of the number of strings of length m ≤ n generated by the grammar. We show how these counts can be useful in providing a partial check that such manipulations of a grammar have been performed without (accidentally) changing the language generated.
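The counting idea behind the second approach, and its use as a partial check on grammar manipulations, can be illustrated with a minimal sketch. This is not the paper's implementation: it is written in Python rather than a lazy functional language, with `functools.lru_cache` standing in for memoization. The grammar encoding, the names `analyse`, `count`, `count_seq` and `sample`, and the two example grammars are all assumptions made for illustration. The sketch further assumes single-character terminals, no empty productions, no cyclic unit productions, and an unambiguous grammar (otherwise the counts are of derivations, not of distinct strings).

```python
import random
from functools import lru_cache


def analyse(grammar):
    """Return (count, sample) for a grammar given as {nonterminal: [rhs tuples]}.

    Assumptions of this sketch: terminals are single characters, there are no
    empty productions or cyclic unit productions, and the grammar is unambiguous.
    """

    @lru_cache(maxsize=None)
    def count(symbol, n):
        # Number of length-n strings derivable from a single symbol.
        if symbol not in grammar:  # terminal: exactly one character
            return 1 if n == 1 else 0
        return sum(count_seq(rhs, n) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def count_seq(symbols, n):
        # Number of length-n strings derivable from a sequence of symbols.
        if not symbols:
            return 1 if n == 0 else 0
        head, rest = symbols[0], symbols[1:]
        # Each remaining symbol needs at least one character, so the head
        # may take between 1 and n - len(rest) characters.
        return sum(count(head, k) * count_seq(rest, n - k)
                   for k in range(1, n - len(rest) + 1))

    def gen(symbol, n, rng):
        if symbol not in grammar:
            return symbol
        # Pick a production with probability proportional to its count.
        rhss = grammar[symbol]
        weights = [count_seq(rhs, n) for rhs in rhss]
        return gen_seq(rng.choices(rhss, weights=weights)[0], n, rng)

    def gen_seq(symbols, n, rng):
        if not symbols:
            return ""
        head, rest = symbols[0], symbols[1:]
        # Pick how many characters the head symbol contributes, again
        # weighted by the number of strings each split accounts for.
        ks = range(1, n - len(rest) + 1)
        weights = [count(head, k) * count_seq(rest, n - k) for k in ks]
        k = rng.choices(ks, weights=weights)[0]
        return gen(head, k, rng) + gen_seq(rest, n - k, rng)

    def sample(symbol, n, rng=random):
        # Draw one length-n string, uniformly under the stated assumptions.
        if count(symbol, n) == 0:
            raise ValueError("no strings of that length")
        return gen(symbol, n, rng)

    return count, sample


# A left-recursive expression grammar and a hand-converted version
# without left recursion (both hypothetical examples).
LEFT = {"E": [("E", "+", "T"), ("T",)],
        "T": [("a",), ("(", "E", ")")]}
RIGHT = {"E": [("T",), ("T", "R")],
         "R": [("+", "T"), ("+", "T", "R")],
         "T": [("a",), ("(", "E", ")")]}

count_left, sample_left = analyse(LEFT)
count_right, sample_right = analyse(RIGHT)

# Partial check: the two grammars should agree on every count.
same = all(count_left("E", n) == count_right("E", n) for n in range(12))
print(same)
print(sample_left("E", 7))
```

Agreement of the counts does not prove that the two grammars generate the same language, but a mismatch at any length shows that the conversion changed it, which is why the check is only partial.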

Cite this paper

@inproceedings{McKenzie1997TheGO, title={The Generation of Strings from a CFG using a Functional Language}, author={Bruce McKenzie}, year={1997} }