Abstract

Both areal and phylogenetic affiliation have been discussed as driving factors of the distribution of word order in the languages of the world. However, disentangling the interaction of these two factors is challenging. Here we take Indo-European as a test case. Word order in this family is largely homogeneous both within areas and within branches, which makes it difficult to assess which factor was more important in shaping the present-day distribution. To break out of this impasse, we turn to corpus data and explicit statistical modeling. Building on a parallel corpus of movie subtitles, we investigate word order on the sentence level under stable pragmatic conditions. We measure the similarity of word order variation between pairs of languages with an information-theoretic distance metric. Applying cluster analysis and variation partitioning to these distances, we show that phylogenetic distance predicts more variation than geographical distance, but the most important predictor is the shared fraction where phylogeny and area overlap. We conclude that word order has evolved along both dimensions and cannot be reduced to a single one.
