Abstract
Computational approaches to historical linguistics have been proposed for half a century. Within the last decade, this line of research has received a major boost, owing both to the transfer of ideas and software from computational biology and to the release of several large electronic data resources suitable for systematic comparative work. In this article, some of the central research topics of this new wave of computational historical linguistics are introduced and discussed. These are automatic assessment of genetic relatedness, automatic cognate detection, phylogenetic inference and ancestral state reconstruction. They will be demonstrated by means of a case study of automatically reconstructing a Proto-Romance word list from lexical data of 50 modern Romance languages and dialects. The results illustrate both the strengths and the weaknesses of the current state of the art of automating the comparative method.
Highlights
Historical linguistics is the oldest sub-discipline of linguistics, and it constitutes an amazing success story
The success of historical linguistics is owed to a large degree to a collection of very stringent methodological principles that go by the name of the comparative method (Meillet 1954; Weiss 2015)
As a final step toward the reconstruction of Proto-Romance forms, Ancestral State Reconstruction is performed on the sound classes in each column of each multiple sequence alignment (MSA) obtained in the previous step
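The column-wise reconstruction step in the last highlight can be illustrated with a minimal sketch. It uses Fitch parsimony as a stand-in technique, which is not necessarily the method used in the article. The tree topology, the language selection, the sound-class symbols and the function names `fitch_states` and `reconstruct_root` are all hypothetical and chosen only for illustration; gaps ("-") are treated as an ordinary state.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

# A binary tree is either a leaf (language name) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_states(tree: Tree, column: Dict[str, str]) -> FrozenSet[str]:
    """Bottom-up Fitch pass: candidate sound classes at the root of `tree`
    for one alignment column (mapping language -> sound class)."""
    if isinstance(tree, str):
        return frozenset([column[tree]])
    left, right = tree
    left_states = fitch_states(left, column)
    right_states = fitch_states(right, column)
    common = left_states & right_states
    # Keep the shared states if any exist; otherwise keep all candidates.
    return common if common else left_states | right_states


def reconstruct_root(tree: Tree,
                     alignment: List[Dict[str, str]]) -> List[FrozenSet[str]]:
    """Apply the Fitch pass independently to every column of one MSA."""
    return [fitch_states(tree, column) for column in alignment]


# Hypothetical toy data: an assumed topology and an invented 3-column MSA.
tree: Tree = (("Italian", "Spanish"), ("French", "Romanian"))
alignment = [
    {"Italian": "p", "Spanish": "p", "French": "p", "Romanian": "p"},
    {"Italian": "a", "Spanish": "a", "French": "e", "Romanian": "a"},
    {"Italian": "t", "Spanish": "d", "French": "-", "Romanian": "t"},
]

print([sorted(states) for states in reconstruct_root(tree, alignment)])
# → [['p'], ['a'], ['t']]
```

When the two subtrees share no state, the sketch returns every candidate rather than picking one, so an ambiguous column stays visibly ambiguous.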
Summary
Historical linguistics is the oldest sub-discipline of linguistics, and it constitutes an amazing success story. The success of historical linguistics is owed to a large degree to a collection of very stringent methodological principles that go by the name of the comparative method (Meillet 1954; Weiss 2015). It can be summarized by a workflow given by Ross and Durie (1996: 6–7). While earlier computational proposals were mostly isolated efforts by historical and computational linguists, the emerging field of computational historical linguistics has received a major impetus since the early 2000s from the work of computational biologists such as Alexandre Bouchard-Côté, Russell Gray, Robert McMahon, Mark Pagel or Tandy Warnow and co-workers, who applied methods from their field to the reconstruction of language history, often in collaboration with linguists. The focus of this article is on computational work inspired by the comparative method, so this line of work will not be covered further here.