Abstract

Syntactic discontinuities are very frequent in classical Latin, and yet these data have never been considered in debates on how expressive grammar formalisms need to be to capture natural languages. In this paper I show with treebank data that Latin frequently displays syntactic discontinuities that cannot be captured in standard mildly context-sensitive frameworks such as Tree-Adjoining Grammars or Combinatory Categorial Grammars. I then argue that there is no principled bound on Latin discontinuities but that they display a broadly Zipfian distribution in which frequency drops quickly for the more complex patterns. Lexical-Functional Grammar can capture these discontinuities in a way that closely reflects their complexity and frequency distributions.

Highlights

  • Classical Latin, like classical Greek, is famous for its tolerance of syntactic discontinuities

  • There were two main responses to this discovery: either one tried to extend context-free formalisms as little as possible while achieving coverage of demonstrably non-context free phenomena such as the cross-serial dependencies from Dutch and Swiss German discussed in Shieber (1985), leading to so-called mildly context-sensitive formalisms such as (Lexicalized) Tree Adjoining Grammar (LTAG) and Combinatory Categorial Grammar (CCG); or one gave up completely on the concern about weak generative capacity, as in Lexical Functional Grammar (LFG) and Head Driven Phrase Structure Grammar (HPSG)

  • Overall in the Universal Dependencies (UD) treebanks, 0.6% of trees contain an edge of gap degree 2, but it is worth pointing out that almost three quarters of these trees are found in one of the Ancient Greek and Latin treebanks, which only make up roughly a tenth of the trees
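
The gap-degree measure mentioned in the last highlight can be illustrated with a minimal Python sketch. It assumes the common node-based definition from the dependency-parsing literature: the gap degree of a node is the number of interruptions in the set of string positions covered by that node and its descendants, and the gap degree of a tree is the maximum over its nodes. The paper speaks of the gap degree of an edge, so its exact definition may differ. The function names and the toy trees are hypothetical and are not drawn from any treebank.

```python
from typing import Dict, List, Set


def projections(heads: List[int]) -> Dict[int, Set[int]]:
    """Map each token to the positions of itself and all its descendants.

    heads[i - 1] is the head of token i (tokens are numbered from 1);
    a head of 0 marks the root. The input is assumed to be a well-formed
    tree (no cycles).
    """
    proj: Dict[int, Set[int]] = {i: {i} for i in range(1, len(heads) + 1)}
    for i in range(1, len(heads) + 1):
        h = heads[i - 1]
        while h != 0:  # walk up to the root, registering i with every ancestor
            proj[h].add(i)
            h = heads[h - 1]
    return proj


def gap_degree(positions: Set[int]) -> int:
    """Count the interruptions in a set of string positions."""
    ordered = sorted(positions)
    return sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > 1)


def tree_gap_degree(heads: List[int]) -> int:
    """Gap degree of a tree: the maximum gap degree over all its nodes."""
    return max(gap_degree(p) for p in projections(heads).values())


# Hypothetical toy trees (not taken from any treebank).
projective = [2, 0, 2]        # tokens 1 and 3 depend on token 2
one_gap = [3, 0, 2, 2]        # token 3 covers positions {1, 3}
two_gaps = [2, 0, 1, 2, 1]    # token 1 covers positions {1, 3, 5}

print(tree_gap_degree(projective), tree_gap_degree(one_gap), tree_gap_degree(two_gaps))
```

In the third toy tree, token 1 covers positions 1, 3 and 5, which are interrupted twice, so the tree has gap degree 2. This is the kind of configuration that the highlight reports for 0.6% of UD trees.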


Summary

Introduction

Classical Latin, like classical Greek, is famous for its tolerance of syntactic discontinuities.

(1) Quis    multa    gracilis    te      puer    in rosa
    who.nom much.abl slender.nom you.acc boy.nom in rose.abl
    perfusus     liquidis   urget          odoribus   grato,         Pyrrha,
    drenched.nom liquid.abl press.3sg.pres scents.abl delightful.abl Pyrrha
    sub antro
    in  cave.abl
    ‘What slender boy, drenched with perfumes, presses you on a bed of roses, Pyrrha, under the delightful cave?’ (Horace, Carmina 1.5)

This example features no fewer than four discontinuous noun phrases, as indicated with subscript indices on the words.

(2) Mihi,  Paulo,   nullus   est terror
    me.dat Paul.dat none.nom is  fear.nom
    ‘I, Paul, have no fear.’

Examples such as (2) are reminiscent of quantifier float, a type of discontinuity which is found even in highly configurational languages such as English. No one attempted to build an argument based on classical data. Another reason for suspicion, no doubt, was the lack of hard facts concerning the extent of syntactic discontinuity in a dead language. As shown in Haug (2015), it used to be the case that scholars could not even agree on the frequencies of basic word orders in Ancient Greek – never mind providing an account of them.

So how complex is natural language really?
Measuring discontinuity in a dependency treebank
Dependency structures and other grammatical formalisms
Complexity in LFG
Quantitative data
Examples
So how complex is Latin really?
There may be an intermediate formalism available
Findings
Conclusion and challenge
