Abstract

Context-Free Grammars (CFGs) and Parsing Expression Grammars (PEGs) have several similarities and a few differences in both their syntax and semantics, but they are usually presented through formalisms that hinder a proper comparison. In this paper we present a new formalism for CFGs that highlights the similarities and differences between the two. The new formalism borrows from PEGs the use of parsing expressions and the recognition-based semantics. We show how one way of removing non-determinism from this formalism yields a formalism with the semantics of PEGs. We also prove, based on these new formalisms, that LL(1) grammars define the same language whether interpreted as CFGs or as PEGs, and show that strong-LL(k), right-linear, and LL-regular grammars have simple language-preserving translations from CFGs to PEGs. Because these classes of CFGs can be automatically translated to equivalent PEGs, we can reuse classic top-down grammars in PEG-based tools.

Highlights

  • Context-Free Grammars (CFGs) are the formalism of choice for describing the syntax of programming languages

  • We presented a new formalism for context-free grammars that is based on recognizing strings instead of generating them

  • We adopted a subset of the syntax of parsing expression grammars, and the notion of letting a grammar recognize just part of an input string, to purposefully get a definition for CFGs that is closer to Parsing Expression Grammars (PEGs), yet defines the same class of languages as traditional CFGs
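
The recognition-based view described in the highlights above can be illustrated with a minimal sketch. It is not the paper's formalism: the tuple encoding of parsing expressions and the names `cfg_match`, `peg_match`, and `GRAMMAR` are assumptions made for this example, and the grammar is assumed to be free of left recursion. `cfg_match` reads choice non-deterministically and returns every position where a match of a prefix of the input may end; `peg_match` reads choice as ordered and commits to the first alternative that succeeds.

```python
# Hypothetical encoding of parsing expressions as tuples (illustration only):
#   ("eps",)          empty string
#   ("chr", c)        terminal c
#   ("var", name)     non-terminal
#   ("seq", p1, p2)   concatenation
#   ("alt", p1, p2)   choice

def cfg_match(grammar, expr, text, pos=0):
    """Non-deterministic reading: set of all end positions of matches starting at pos."""
    kind = expr[0]
    if kind == "eps":
        return {pos}
    if kind == "chr":
        return {pos + 1} if text.startswith(expr[1], pos) else set()
    if kind == "var":
        return cfg_match(grammar, grammar[expr[1]], text, pos)
    if kind == "seq":
        ends = set()
        for mid in cfg_match(grammar, expr[1], text, pos):
            ends |= cfg_match(grammar, expr[2], text, mid)
        return ends
    if kind == "alt":
        # Both alternatives are explored and their results are merged.
        return (cfg_match(grammar, expr[1], text, pos)
                | cfg_match(grammar, expr[2], text, pos))
    raise ValueError(f"unknown expression kind: {kind}")

def peg_match(grammar, expr, text, pos=0):
    """Deterministic reading: the single end position, or None on failure."""
    kind = expr[0]
    if kind == "eps":
        return pos
    if kind == "chr":
        return pos + 1 if text.startswith(expr[1], pos) else None
    if kind == "var":
        return peg_match(grammar, grammar[expr[1]], text, pos)
    if kind == "seq":
        mid = peg_match(grammar, expr[1], text, pos)
        return None if mid is None else peg_match(grammar, expr[2], text, mid)
    if kind == "alt":
        # Ordered choice: the second alternative is tried only if the first fails.
        first = peg_match(grammar, expr[1], text, pos)
        return first if first is not None else peg_match(grammar, expr[2], text, pos)
    raise ValueError(f"unknown expression kind: {kind}")

# S -> A 'c'      A -> 'a' | 'a' 'b'
GRAMMAR = {
    "S": ("seq", ("var", "A"), ("chr", "c")),
    "A": ("alt", ("chr", "a"), ("seq", ("chr", "a"), ("chr", "b"))),
}

print(cfg_match(GRAMMAR, ("var", "A"), "abc"))  # {1, 2}: both prefixes "a" and "ab"
print(peg_match(GRAMMAR, ("var", "A"), "abc"))  # 1: commits to the first alternative
print(cfg_match(GRAMMAR, ("var", "S"), "abc"))  # {3}: "abc" is in the CFG language
print(peg_match(GRAMMAR, ("var", "S"), "abc"))  # None: the PEG reading rejects "abc"
```

Both functions accept a prefix of the input rather than the whole string, and the only difference between them is the treatment of choice, which is what makes the two readings of the same grammar diverge on "abc".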


Summary

Introduction

Context-Free Grammars (CFGs) are the formalism of choice for describing the syntax of programming languages. We show that any LL-regular grammar can be transformed into a PEG that recognizes the same language. We first prove that right-linear grammars for languages with the prefix property, a property that is easy to achieve, define the same language whether interpreted as CFGs or as PEGs. We then use this result to build lookahead expressions for the alternatives of each non-terminal, based on the regular partition into which each alternative falls. LL(1) grammars are a proper subset of strong-LL(k) grammars, which are in turn a proper subset of LL-regular grammars, so the LL-regular transformation also works on grammars belonging to these simpler classes. The simpler classes nevertheless have more straightforward transformations, which merit a separate treatment. Given that these classes of top-down CFGs can be automatically translated into equivalent PEGs, we can reuse classic top-down grammars in PEG-based tools.
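
The idea of guarding each alternative with a lookahead expression can be sketched for a small grammar whose alternatives are distinguished by two symbols of lookahead. This is an illustration under stated assumptions, not the paper's construction: the tuple encoding, the helper names `literal` and `guarded_choice`, and the hand-written lookahead sets are all hypothetical, and the sketch ignores lookahead strings that are cut short by the end of the input.

```python
def peg_match(grammar, expr, text, pos=0):
    """PEG reading of a parsing expression: end position, or None on failure."""
    kind = expr[0]
    if kind == "eps":
        return pos
    if kind == "chr":
        return pos + 1 if text.startswith(expr[1], pos) else None
    if kind == "var":
        return peg_match(grammar, grammar[expr[1]], text, pos)
    if kind == "seq":
        mid = peg_match(grammar, expr[1], text, pos)
        return None if mid is None else peg_match(grammar, expr[2], text, mid)
    if kind == "alt":
        first = peg_match(grammar, expr[1], text, pos)
        return first if first is not None else peg_match(grammar, expr[2], text, pos)
    if kind == "and":
        # And-predicate: succeeds without consuming input if the operand matches.
        return pos if peg_match(grammar, expr[1], text, pos) is not None else None
    raise ValueError(f"unknown expression kind: {kind}")

def literal(s):
    """Build a sequence expression that matches the string s."""
    expr = ("eps",)
    for c in reversed(s):
        expr = ("seq", ("chr", c), expr)
    return expr

def guarded_choice(alternatives):
    """Build &(lookahead_1) body_1 / &(lookahead_2) body_2 / ...

    alternatives is a list of (lookahead_strings, body) pairs. Each set of
    lookahead strings is assumed to be non-empty and computed elsewhere.
    """
    result = None
    for lookaheads, body in reversed(alternatives):
        guard = None
        for la in sorted(lookaheads, reverse=True):
            lit = literal(la)
            guard = lit if guard is None else ("alt", lit, guard)
        option = ("seq", ("and", guard), body)
        result = option if result is None else ("alt", option, result)
    return result

S = ("seq", ("var", "A"), ("chr", "c"))

# CFG: S -> A 'c'    A -> 'a' | 'a' 'b'    (language: "ac", "abc")
plain = {"S": S, "A": ("alt", literal("a"), literal("ab"))}

# Same grammar with each alternative of A guarded by its two-symbol lookahead
# (written by hand here: "ac" selects the first alternative, "ab" the second).
guarded = {"S": S, "A": guarded_choice([({"ac"}, literal("a")),
                                        ({"ab"}, literal("ab"))])}

print(peg_match(plain, ("var", "S"), "abc"))    # None: ordered choice picks 'a' too early
print(peg_match(guarded, ("var", "S"), "abc"))  # 3: the guard selects 'a' 'b'
print(peg_match(guarded, ("var", "S"), "ac"))   # 2
```

With the guards in place, the order of the alternatives no longer matters for this grammar, because at most one guard can succeed at any input position.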

From CFGs to PEGs
From CFGs to PE-CFGs
1: Natural semantics of the abstract syntax of parsing expressions
Correspondence between CFGs and PE-CFGs
From PE-CFGs to PEGs
Correspondence with Ford’s Definition
Right-linear and LL-regular Grammars
Related Work
Conclusions
