A new top-down parsing algorithm to accommodate ambiguity and left recursion in polynomial time

@article{Frost2006ANT,
  title={A new top-down parsing algorithm to accommodate ambiguity and left recursion in polynomial time},
  author={Richard A. Frost and Rahmatullah Hafiz},
  journal={ACM SIGPLAN Notices},
  year={2006},
  volume={41},
  pages={46--54}
}
Top-down backtracking language processors are highly modular, can handle ambiguity, and are easy to implement with clear and maintainable code. However, a widely-held, and incorrect, view is that top-down processors are inherently exponential for ambiguous grammars and cannot accommodate left-recursive productions. It has been known for many years that exponential complexity can be avoided by memoization, and that left-recursive productions can be accommodated through a variety of techniques… 
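The two ideas named in the abstract, memoization and accommodating left recursion, can be illustrated with a small recognizer. This is a minimal sketch under stated assumptions, not the authors' algorithm. The names `recognize`, `EXPR`, and `AMBIG` are hypothetical. The curtailment bound (at most "remaining tokens plus one" nested calls of a nonterminal at one position) is an assumption of this sketch. The storage rule is deliberately conservative: a result is stored only when no other nonterminal call is pending at the same position, so the sketch terminates on left-recursive grammars but makes no claim of polynomial running time.

```python
from collections import defaultdict


def recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of alternatives; each
    alternative is a list of symbols. Any symbol that is not a key of
    `grammar` is treated as a terminal matching one token. An empty
    alternative ([]) matches the empty string.
    """
    n = len(tokens)
    memo = {}                  # (nonterminal, start pos) -> set of end positions
    depth = defaultdict(int)   # (nonterminal, start pos) -> pending nested calls
    active = defaultdict(int)  # start pos -> pending nonterminal calls of any kind

    def symbol(sym, j):
        if sym not in grammar:  # terminal: consume exactly one matching token
            return {j + 1} if j < n and tokens[j] == sym else set()
        if (sym, j) in memo:    # reuse a result known to be complete
            return memo[(sym, j)]
        if depth[(sym, j)] > n - j:
            # Curtail: this nonterminal is already nested (remaining tokens + 1)
            # times at this position, so a deeper call adds no new end position.
            return set()
        # Assumption of this sketch: a result is safe to store only when it
        # cannot have been cut short by a call already pending at position j.
        storable = active[j] == 0
        depth[(sym, j)] += 1
        active[j] += 1
        ends = set()
        for alternative in grammar[sym]:
            ends |= sequence(alternative, j)
        depth[(sym, j)] -= 1
        active[j] -= 1
        if storable:
            memo[(sym, j)] = frozenset(ends)
        return ends

    def sequence(symbols, j):
        positions = {j}
        for sym in symbols:
            positions = {e for p in positions for e in symbol(sym, p)}
            if not positions:   # this alternative has failed; stop early
                break
        return positions

    return n in symbol(start, 0)


# A directly left-recursive, ambiguous expression grammar: E ::= E '+' E | 'n'
EXPR = {"E": [["E", "+", "E"], ["n"]]}

# A left-recursive grammar with an empty alternative: S ::= S S 'a' | empty
AMBIG = {"S": [["S", "S", "a"], []]}

print(recognize(EXPR, "E", "n + n + n".split()))  # True
print(recognize(EXPR, "E", "n +".split()))        # False
```

A naive recursive-descent recognizer would loop forever on `E ::= E '+' E`; here the depth counter cuts the recursion off, and the memo table avoids recomputing results at later input positions.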
Efficient combinator parsing for natural-language.
TLDR
A new combinator-parsing algorithm is proposed, which is efficient, modular, accommodates all forms of CFG and represents all possible resulting parse-trees in a densely compact format.
Executable attribute grammars for modular and efficient natural language processing
TLDR
A new modular top-down syntactic and semantic analysis system is proposed, which is efficient and accommodates all forms of CFGs, and provides notation to declaratively specify semantics by establishing arbitrary dependencies between attributes of syntactic categories to perform linguistically-motivated tasks.
Modular Parsers for Natural-Language Processing (with proofs in the appendices)
TLDR
The approach to accommodate ambiguity and direct and indirect left recursion in polynomial time is extended to create compact polynomial-sized representations of the potentially exponential number of parse trees which can be generated for highly ambiguous languages.
Modular and Efficient Top-Down Parsing for Ambiguous Left-Recursive Grammars
TLDR
This paper combines aspects of previous approaches and presents a method by which parsers can be built as modular and efficient executable specifications of ambiguous grammars containing unconstrained left recursion.
The extension of deterministic cancellation parser to directly handle indirect and hidden left recursion
TLDR
The resulting parser is a deterministic cancellation parser with a recursive descent structure, which has more acceptance power because it can work with all kinds of left recursion without any need to transform them into non-left-recursive equivalents.
Packrat parsers can support left recursion
TLDR
This paper presents a modification to the memoization mechanism used by packrat parser implementations that makes it possible for them to support (even indirectly or mutually) left-recursive rules.
Pika parsing: reformulating packrat parsing as a dynamic programming algorithm solves the left recursion and error recovery problems
TLDR
The pika parser is a novel reformulation of packrat parsing as a dynamic programming algorithm, which requires parsing the input in reverse: bottom-up and right to left, rather than top-down and left to right, which enables optimal recovery from syntax errors.
Pika parsing: parsing in reverse solves the left recursion and error recovery problems
TLDR
The pika parser is presented, a novel reformulation of packrat parsing using dynamic programming to parse the input in reverse: bottom-up and right to left, rather than top-down and left to right, which enables direct and optimal recovery from syntax errors, which is a crucial property for building IDEs and compilers.
Tunnel Parsing with counted repetitions
TLDR
A new and efficient algorithm for parsing, called Tunnel Parsing, that parses from left to right on the basis of a context-free grammar without left recursion and rules that recognize empty words is described.
Exploration of conflict situations in deterministic cancellation parser
TLDR
This paper investigates situations that result in conflicts at which parsers cannot make a deterministic decision to continue the process of parsing and tries to categorize them.

References

Lazy recursive descent parsing for modular language implementation
TLDR
A variant of the well‐known recursive descent parsing technique is developed, based on the assumption that each non‐terminal of the language is implemented through a separate module, which allows a language to be implemented in small pieces which are easy to modify, replace, and reuse.
Techniques for Automatic Memoization with Applications to Context-Free Parsing
It is shown that a process similar to Earley's algorithm can be generated by a simple top-down backtracking parser, when augmented by automatic memoization. The memoized parser has the same
Guarded attribute grammars
  • R. Frost
  • Computer Science
    Softw. Pract. Exp.
  • 1993
TLDR
A novel technique has been discovered by which the non‐termination that would otherwise occur is avoided by ‘guarding’ top‐down left‐recurrent language processors by non‐left‐recursive recognizers.
Efficient Combinator Parsers
TLDR
It is shown how the speed of these parsers can be improved by one order of magnitude using continuations, which prevents the creation of intermediate data structures and reduces the complexity for deterministic parsers from polynomial to linear.
Higher-Order Functions for Parsing
  • G. Hutton
  • Computer Science
    J. Funct. Program.
  • 1992
TLDR
This work presents the basic method for combinator parsing, and a number of extensions, and addresses the special problems presented by white-space, and parsers with separate lexical and syntactic phases.
Memoization in Top-Down Parsing
TLDR
A version of memoization suitable for continuation-passing style programs is presented; when applied to a simple formalization of a top-down recognizer, it yields a terminating parser.
Mimico: a Monad Combinator Parser Generator
TLDR
A compiler generator that outputs code based on the use of monadic combinators, Mimico provides an easy way of specifying the syntax and semantics of languages, and generates readable output in the form of Haskell programs.
Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems
TLDR
A Parsing Table Constructor for Nishida and Doshita's System, with examples of left-to-Right on-Line Parsing and Interactive/Personal Machine Translation.
The Functional Treatment of Parsing
TLDR
The aim of this monograph is to clarify the role of notation in the development of grammar and to provide a framework for the subsequent development of formal grammar-based criticism.
Mimico: A Monadic Combinator Compiler Generator
TLDR
A prototype of a compiler generator, called Mimico, is described, that handles infinite look-ahead and left recursive context free grammars, and dyadic infix operator precedence and associativity.