Using hybridization networks to retrace the evolution of Indo-European languages.

Matthieu Willems, Louise Laforest, Etienne Lord, François-Joseph Lapointe, Anna Maria Di Sciullo, Gilbert Labelle, Vladimir Makarenkov

doi:10.1186/s12862-016-0745-6


Open Access

https://doi.org/10.1186/s12862-016-0745-6


Abstract

Background: Curious parallels between the processes of species and language evolution have been observed by many researchers. Retracing the evolution of Indo-European (IE) languages remains one of the most intriguing intellectual challenges in historical linguistics. Most of the IE language studies use the traditional phylogenetic tree model to represent the evolution of natural languages, thus not taking into account reticulate evolutionary events, such as language hybridization and word borrowing, which can be associated with species hybridization and horizontal gene transfer, respectively. More recently, implicit evolutionary networks, such as split graphs and minimal lateral networks, have been used to account for reticulate evolution in linguistics.

Results: Striking parallels existing between the evolution of species and natural languages allowed us to apply three computational biology methods for reconstruction of phylogenetic networks to model the evolution of IE languages. We show how the transfer of methods between the two disciplines can be achieved, making necessary methodological adaptations. Considering basic vocabulary data from the well-known Dyen’s lexical database, which contains word forms in 84 IE languages for the meanings of a 200-meaning Swadesh list, we adapt a recently developed computational biology algorithm for building explicit hybridization networks to study the evolution of IE languages and compare our findings to the results provided by the split graph and galled network methods.

Conclusion: We conclude that explicit phylogenetic networks can be successfully used to identify donors and recipients of lexical material as well as the degree of influence of each donor language on the corresponding recipient languages. We show that our algorithm is well suited to detect reticulate relationships among languages, and present some historical and linguistic justification for the results obtained. Our findings could be further refined if relevant syntactic, phonological and morphological data could be analyzed along with the available lexical data.

Electronic supplementary material: The online version of this article (doi:10.1186/s12862-016-0745-6) contains supplementary material, which is available to authorized users.

Highlights

For example, the finding that Old Armenian is a lexical hybrid of Old Persian and Old Greek should be interpreted as identifying the influence (e.g., cultural, political or military) that the two parent languages had on their lexical hybrid, possibly at different periods of time and possibly lasting over several centuries
We present the most important reticulation events characterizing the evolution of IE languages which were identified by the three competing algorithms for inferring split graphs, galled networks and our explicit hybridization networks, respectively
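
The idea of a recipient language drawing lexical material from two donor languages in different proportions can be illustrated with a toy calculation. The sketch below is not the hybridization network algorithm used in the paper. It assumes hypothetical cognate-class codes for six meanings (the names `donor_a`, `donor_b` and `hybrid`, and the functions `lexical_distance` and `donor_shares`, are invented for illustration) and simply counts the share of meanings on which a recipient agrees with each candidate donor.

```python
from typing import Dict, List


def lexical_distance(a: List[int], b: List[int]) -> float:
    """Fraction of meanings whose cognate classes differ between two languages."""
    if len(a) != len(b):
        raise ValueError("word lists must cover the same meanings")
    differing = sum(1 for x, y in zip(a, b) if x != y)
    return differing / len(a)


def donor_shares(recipient: List[int], donors: Dict[str, List[int]]) -> Dict[str, float]:
    """For each candidate donor, the fraction of meanings on which it shares
    a cognate class with the recipient (1 minus the lexical distance)."""
    return {
        name: 1.0 - lexical_distance(recipient, words)
        for name, words in donors.items()
    }


# Hypothetical cognate-class codes for six meanings (not taken from Dyen's database).
donor_a = [1, 1, 1, 1, 1, 1]
donor_b = [2, 2, 2, 2, 2, 2]
hybrid = [1, 1, 1, 1, 2, 2]  # agrees with donor_a on four meanings, donor_b on two

shares = donor_shares(hybrid, {"donor_a": donor_a, "donor_b": donor_b})
for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```

In this toy setting the two shares sum to one because every meaning in the hybrid matches exactly one donor. With real lexical data a meaning may match both candidate donors or neither, which is one reason a dedicated network inference method is needed rather than simple counting.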

Summary

Introduction

Curious parallels between the processes of species and language evolution have been observed by many researchers. Most IE language studies use the traditional phylogenetic tree model to represent the evolution of natural languages, and thus do not take into account reticulate evolutionary events, such as language hybridization and word borrowing, which can be associated with species hybridization and horizontal gene transfer, respectively. Implicit evolutionary networks, such as split graphs and minimal lateral networks, have been used to account for reticulate evolution in linguistics.

Many curious similarities between the processes of species and language evolution have been observed since Darwin’s The Descent of Man [1]. The latter study compares the […]

Willems et al. BMC Evolutionary Biology (2016) 16:180

[…] biology methods for studying the evolution of species, and in particular reticulate evolution, in the field of linguistics. The existing phylogenetic algorithms should be modified and workflows adapted in order to obtain meaningful linguistic results and interpretations.

Methods

Results

Conclusion