Abstract

We introduce Interleave-Disjunction-Lock parallel multiple context-free grammars (IDL-PMCFG), a novel grammar formalism designed to describe the syntax of free word order languages that allow for extensive interleaving of grammatical constituents. Though interleaved constituents, and especially the so-called hyperbaton, are common in several ancient (Classical Latin and Greek, Sanskrit...) and modern (Hungarian, Finnish...) languages, these syntactic structures are often difficult to express in existing formalisms. The IDL-PMCFG formalism combines Seki et al.’s parallel multiple context-free grammars (PMCFG) with Nederhof and Satta’s IDL expressions. We define the semantics of IDL-PMCFGs and study their expressivity, proving that IDL-PMCFG extends both PMCFG and IDL-CFG (context-free grammars equipped with IDL expressions) and that IDL-PMCFG parsing is NP-hard. We then introduce COMPĀ, a programming language extending Ranta’s Grammatical Framework (GF) and built as a high-level front-end formalism to IDL-PMCFG for practical grammar development. We present a parsing algorithm for IDL-PMCFG inspired by earlier Earley-style PMCFG parsing algorithms and Nederhof and Satta’s IDL graphs and give a worst-case estimate of its complexity as a function of several metrics on IDL expressions, the size of the input and a new notion of the G-density of a language.
