Abstract

Tomita's Generalized LR(1) parsing algorithm (GLR), later improved in many ways, runs in linear time on LR(1) grammars and degrades to a polynomial-time bound if the grammar is not deterministic. We address a useful feature not present in current GLR(1) methods: the ability to accept grammars of the Extended BNF (EBNF) type, whose rules contain regular expressions. An EBNF grammar is conveniently represented by a collection of finite automata called a Transition Net (TN). We define, analyze and evaluate a new GLR(1) algorithm, called GELR, that combines the recent LR(1) parsing algorithm for TNs with the classical GLR data structures: the Graph-Structured Stack, which represents multiple stacks, and the Shared Packed Parse Forest, which represents multiple syntax trees. The GELR algorithm is proved correct, and an efficient implementation incorporating the state-of-the-art Right-Nulled parsing optimization is available. Experimental measurements of the GELR parser's size, speed and memory footprint are reported for current programming and web languages and are compared with those of other parsing algorithms. The findings show that directly parsing EBNF grammars does not penalize speed. The performance comparisons across different computer languages should also be of interest.
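To make the Transition Net idea concrete, the following is a minimal sketch, not the GELR algorithm itself. It encodes the EBNF rules E → T ('+' T)* and T → 'a' | '(' E ')' as one finite automaton per nonterminal and recognizes input with a naive top-down traversal of those automata. The grammar, the names `TN`, `ends` and `accepts`, and the tuple layout are all hypothetical choices made for illustration; the traversal assumes the grammar is not left-recursive.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: machine name -> (start state, final states, transitions).
# A transition label is either a terminal or the name of another machine.
Machine = Tuple[int, Set[int], Dict[Tuple[int, str], int]]

TN: Dict[str, Machine] = {
    # E -> T ('+' T)*
    "E": (0, {1}, {(0, "T"): 1, (1, "+"): 2, (2, "T"): 1}),
    # T -> 'a' | '(' E ')'
    "T": (0, {1}, {(0, "a"): 1, (0, "("): 2, (2, "E"): 3, (3, ")"): 1}),
}


def ends(name: str, tokens: List[str], pos: int) -> Set[int]:
    """Return every input position at which machine `name` can finish
    when started at `pos`. Naive and exponential in the worst case;
    assumes no left recursion, otherwise it would not terminate."""
    start, finals, delta = TN[name]
    results: Set[int] = set()
    seen: Set[Tuple[int, int]] = set()
    work = [(start, pos)]
    while work:
        state, i = work.pop()
        if (state, i) in seen:
            continue
        seen.add((state, i))
        if state in finals:
            results.add(i)
        for (src, sym), dst in delta.items():
            if src != state:
                continue
            if sym in TN:
                # Nonterminal label: run the called machine from position i.
                for j in ends(sym, tokens, i):
                    work.append((dst, j))
            elif i < len(tokens) and tokens[i] == sym:
                # Terminal label: consume one token.
                work.append((dst, i + 1))
    return results


def accepts(tokens: List[str]) -> bool:
    """True if the whole token list is derivable from machine E."""
    return len(tokens) in ends("E", tokens, 0)


print(accepts(list("a+(a+a)")))
print(accepts(list("a+")))
```

The Kleene star in the rule for E appears as a cycle between states 1 and 2 of its automaton, so no auxiliary nonterminal is needed to express the repetition. A GLR-style parser would replace the recursive exploration here with a shared stack structure; this sketch only illustrates the grammar representation.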
