Abstract

In current practice, dating the root of a Bayesian language phylogeny requires the researcher to supply some information beforehand, including a distribution of root ages and dates for some nodes that serve as calibration points. Besides leaving room for subjectivity, this raises the problem that many of the world's language families have no available internal calibration points. Here we address the following questions: Can a new Bayesian framework be introduced that overcomes these problems, and how well does it perform? The framework we present is generalized in the sense that it needs no family-specific priors or calibration points. We also remove another potential source of subjectivity in Bayesian tree inference as commonly practiced, namely manual cognate identification, by applying an automated approach instead. Dates are obtained by fitting a Gamma regression model to tree lengths and known time depths for 30 phylogenetically independent calibration points. This model is then used to predict the time depths of both the root and the internal nodes for 116 language families, producing a total of 1,287 dates for families and subgroups. The results are similar to those of published Bayesian studies of individual language families. The performance of the method is compared to automated glottochronology, an update of Swadesh's classical method that draws on automated cognate recognition and a new formula for deriving a time depth from percentages of shared cognates. It is also compared to a third dating method, that of the Automated Similarity Judgment Program (ASJP). In terms of errors and correlations with known dates, ASJP works better than the new method, and both work better than automated glottochronology.
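The regression step described in the abstract can be illustrated with a minimal sketch. It assumes a log link, a single predictor (the logarithm of tree length), and synthetic calibration values invented purely for illustration. None of these choices, nor the function names `fit_gamma_log_link` and `predict_age`, are taken from the study itself.

```python
import math


def fit_gamma_log_link(x, y, n_iter=50, tol=1e-10):
    """Fit E[y] = exp(b0 + b1 * x) as a Gamma GLM with a log link.

    For the Gamma family with a log link the IRLS weights are constant,
    so every iteration is an ordinary least-squares fit of the working
    response z = eta + (y - mu) / mu on x.
    """
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    # Starting point: least-squares fit of log(y) on x.
    z = [math.log(yi) for yi in y]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        z_mean = sum(z) / n
        new_b1 = sum((xi - x_mean) * (zi - z_mean) for xi, zi in zip(x, z)) / sxx
        new_b0 = z_mean - new_b1 * x_mean
        converged = abs(new_b0 - b0) < tol and abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
        # Update the working response from the current fitted means.
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
    return b0, b1


def predict_age(b0, b1, tree_length):
    """Predicted time depth for a node with the given tree length."""
    return math.exp(b0 + b1 * math.log(tree_length))


# Synthetic calibration points (NOT data from the study): the ages follow
# an exact power law of tree length, so the fit is easy to verify.
tree_lengths = [0.05, 0.1, 0.2, 0.4, 0.8]
known_ages = [3000.0 * length ** 0.9 for length in tree_lengths]

x = [math.log(length) for length in tree_lengths]
b0, b1 = fit_gamma_log_link(x, known_ages)
predicted = predict_age(b0, b1, 0.3)
```

Once the coefficients are fitted on the calibration points, the same `predict_age` call can be applied to the root or to any internal node of a tree. The Gamma dispersion parameter is not estimated here because it does not affect the point predictions.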

Highlights

  • The assignment of age to a proto-language has long been a desideratum in historical linguistics, and with [1] and in subsequent works by Swadesh a quantitative method was developed based on the hypothesis that the rate of replacement of core lexical items is approximately constant over time

  • In this paper we have introduced a new method called Generalized Bayesian Dating (GBD) for inferring dates of language groups from lexical and phonological data

  • It was tested against Automated Similarity Judgment Program (ASJP) chronology and glottochronology

Summary

Introduction

The assignment of age to a proto-language has long been a desideratum in historical linguistics, and with [1] and in subsequent works by Swadesh a quantitative method was developed based on the hypothesis that the rate of replacement of core lexical items is approximately constant over time. This method has been criticized extensively, mainly through examples showing that.

A test of Generalized Bayesian dating from Beijing Language Innovation Center in support of a sub-topic directed by Qibin Ran. The sponsors or funders did not play any role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
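The constant-replacement hypothesis mentioned in the Introduction is usually expressed through the classical glottochronological formula t = ln(c) / (2 ln(r)), where c is the fraction of shared cognates between two languages and r is the fraction of items a language retains per millennium. The sketch below shows that classical formula only; it is not the new formula used by automated glottochronology, which is not given in this section. The default retention rate of 0.86 is the value conventionally quoted for the 100-item list and is an assumption here, as is the function name.

```python
import math


def glottochronological_age(shared_fraction, retention_rate=0.86):
    """Classical time depth in millennia: t = ln(c) / (2 * ln(r)).

    The factor 2 reflects that both languages lose items independently,
    so the expected shared fraction after t millennia is r ** (2 * t).
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))


# Two languages sharing r ** 2 of their core items diverged
# one millennium ago under this model.
age = glottochronological_age(0.86 ** 2)
```

Because the result depends entirely on a single assumed retention rate, any variation in that rate across languages or periods translates directly into dating error.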

