Practical Earley Parsing

@article{Aycock2002PracticalEP,
  title={Practical Earley Parsing},
  author={John Aycock and R. Nigel Horspool},
  journal={Comput. J.},
  year={2002},
  volume={45},
  pages={620-630}
}
Earley’s parsing algorithm isa general algorithm, ableto handleany context-free grammar. As with most parsing algorithms, however, the presence of grammar rules having empty right-hand sides complicates matters. By analyzing why Earley’s algorithm struggles with these grammar rules, we have devised a simple solution to the problem. Our empty-rule solution leads to a new type of finite automaton expressly suited for use in Earley parsers and to a new statement of Earley’s algorithm. We show that… Expand
Parsing with Scannerless Earley Virtual Machines
TLDR
A new, virtual machine based approach to parsing, heavily based on the original Earley parser is presented, and it is shown how to translate grammars into virtual machine instruction sequences that are then used by the parsing algorithm. Expand
Efficient Earley Parsing with Regular Right-hand Sides
TLDR
A new variant of the Earley parsing algorithm capable of efficiently supporting context-free grammars with regular right hand-sides is presented, and a theoretical framework for presenting the algorithm and for evaluating optimizations is included. Expand
A Comparison of CYK and Earley Parsing Algorithms
Parsers for programming languages are usually based on context-free grammars. So designing efficient parsing algorithms for context-free grammars is critical. CYK algorithm and Earley algorithm areExpand
On The Practice of B-ing Earley
TLDR
This thesis develops Earley's parsing algorithm from specification using the B-Method, and arrives at a list-processing formulation, in which the problems with -productions emerge and can be understood. Expand
A Modified Earley Parser for Huge Natural Language Grammars
TLDR
This paper aims to improve time and memory usage performances of Earley parser for grammars with a large number of rules by preferring radix tree representation for rules instead of list representation as in original Earleyparser. Expand
Early action in an Earley parser
TLDR
Safe Earley sets are identified, points during the recognition phase at which partial parse trees can be constructed; this means that semantic actions may be executed on the fly, resulting in a substantial savings of both space and time. Expand
Just-in-time Parsing with Scannerless Earley Virtual Machines
TLDR
It is shown how just-in-time compilation can be used to translate intermediate form grammars into native machine code to achieve improved parsing performance and an efficient method for lexical disambiguation is presented. Expand
Parsing reflective grammars
TLDR
This work demonstrates and proves the correctness of an algorithm for parsing reflective grammars, which can modify their own syntax during parsing, and proves that it performs asymptotically no worse than Earley's algorithm on ordinary context-free Grammars. Expand
Pushdown Automata and Parsing
TLDR
This chapter covers pushdown automata and parsing algorithms with emphasis on the application to syntax analysis, and presents the classical deterministic bottom-up and top-down parsing methods, adopting a novel approach that allows for the analysis of the languages defined by Extended BNF (EBNF) grammars. Expand
Simple, Functional, Sound and Complete Parsing for All Context-Free Grammars
TLDR
This work shows how to construct simple, sound and complete parser implementations directly from grammar specifications, for all context-free grammars, based on combinator parsing, and constructs a generic parser generator that is faster than the popular Happy parser generator. Expand
...
1
2
3
4
5
...

References

SHOWING 1-10 OF 26 REFERENCES
Directly-Executable Earley Parsing
TLDR
This work describes how to narrow the performance gap between general and deterministic parsers, constructing a directly-executable Earley parser that can reach speeds comparable to deterministic methods even on grammars for commonly-used programming languages. Expand
Construction of Efficient Generalized LR Parsers
TLDR
The result is a Generalized LR parsing algorithm working at complexity O(n3) in the worst case, which is achieved by the use of dynamic programming to represent the non-deterministic evolution of the stack instead of graph-structured stack representations, as has often been the case in previous approaches. Expand
Efficient parsing algorithms for general context-free parsers
TLDR
Empirical comparisons show the effectiveness of the new members of the family of parsing algorithms, which appear to be definitely better than Earley's algorithms, except for a few pathological grammars. Expand
Efficient incremental parsing for context-free languages
  • M. Ferro, B. Dion
  • Computer Science
  • Proceedings of 1994 IEEE International Conference on Computer Languages (ICCL'94)
  • 1994
TLDR
An incremental parsing algorithm based on dynamic programming techniques is described that appears to be superior to the other context-free analyzers and comparable to standard deterministic parsers, when the input string is not ambiguous. Expand
A Faster Earley Parser
TLDR
The new method retains the ability of Earley's method to parse using arbitrary context-free grammars, but by using precomputed LR(k) sets of items, it obtain much faster recognition speeds while also reducing memory requirements. Expand
An efficient context-free parsing algorithm
TLDR
A parsing algorithm which seems to be the most efficient general context-free algorithm known is described and appears to be superior to the top-down and bottom-up algorithms studied by Griffiths and Petrick. Expand
An Improved Context-Free Recognizer
TLDR
Surprisingly close connections between the Cocke-Kasami-Younger and Earley algorithms are established which reveal that the two algorithms are “almost” identical. Expand
The Structure of Shared Forests in Ambiguous Parsing
The Context-Free backbone of some natural language analyzers produces all possible CF parses as some kind of shared forest, from which a single tree is to be chosen by a disambiguation process thatExpand
Incremental generation of parsers
An LR-based parser generator for arbitrary context-free grammars is described, which generates parsers by need and processes grammar modifications by updating already existing parsers. We motivateExpand
Using Filters for the Disambiguation of Context-free Grammars
TLDR
A framework of filters is proposed to describe and compare a wide range of disambiguation problems in a parser-independent way and enables us to define several general properties of disAmbiguation methods. Expand
...
1
2
3
...