Abstract

Following Den Besten’s (2009) desiderata for the historical linguistics of Afrikaans, this article aims to contribute some modern evidence to the debate regarding the founding dialects of Afrikaans. From an applied perspective (i.e. human language technology), we aim to determine which West Germanic language(s) and/or dialect(s) would be best suited for recycling speech resources to develop speech technologies for Afrikaans. Since Afrikaans is recognised as a West Germanic language, it is first compared to Standard Dutch, Standard Frisian and Standard German. Pronunciation distances are measured by means of Levenshtein distances. Afrikaans is found to be closest to Standard Dutch. Secondly, Afrikaans is compared to 361 Dutch dialectal varieties in the Netherlands and northern Belgium, using material from the Reeks Nederlandse Dialectatlassen, a series of dialect atlases compiled by Blancquaert and Pee in the period 1925-1982 which cover the Dutch dialect area. Afrikaans is found to be closest to the South-Holland dialectal variety of Zoetermeer; this largely agrees with the findings of Kloeke (1950). No speech resources are available for Zoetermeer, but such resources are available for Standard Dutch. Although the dialect of Zoetermeer is significantly closer to Afrikaans than Standard Dutch is, Standard Dutch speech resources might be a good substitute.
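To make the notion of a Levenshtein-based pronunciation distance concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each word is given as a sequence of phonetic segments, that all edit operations cost 1, and that the raw distance is normalised by the length of the longer transcription; the abstract does not state the weighting or normalisation actually used. The function names and the toy segment sequences are hypothetical.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn segment sequence a into segment sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, seg_a in enumerate(a, start=1):
        curr = [i]  # deleting the first i segments of a
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1  # assumption: unit substitution cost
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def normalized_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Levenshtein distance divided by the longer length (assumed normalisation)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(words_a: Sequence[Sequence[str]],
                  words_b: Sequence[Sequence[str]]) -> float:
    """Average normalised distance over aligned word pairs of two varieties."""
    pairs = list(zip(words_a, words_b))
    return sum(normalized_distance(x, y) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy segment sequences for illustration only, not real transcriptions.
    variety_a = [["a", "b", "c"], ["d", "e"]]
    variety_b = [["a", "b", "d"], ["d", "e"]]
    print(mean_distance(variety_a, variety_b))
```

Averaging the per-word distances over a shared word list yields one distance per pair of varieties, which is the kind of figure that lets Afrikaans be ranked against each standard language or dialect.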

Highlights

  • The development of language resources for use in human language technologies (HLTs) is time-consuming, tedious and expensive, in terms of both human and other resources

  • Given that we focus on acoustic data, we will attempt to quantify the relationship between the pronunciation of Afrikaans and other West Germanic languages (i.e. Standard Dutch, Standard Frisian and Standard German) and 361 Dutch dialects in terms of an acoustic distance measure

  • We will answer the first research question mentioned in section 1: Is Dutch, acoustically speaking, the closest West Germanic language to Afrikaans? In the same section, we found from the literature that Afrikaans belongs to the West Germanic languages


Summary

Introduction

The development of language resources for use in human language technologies (HLTs) is time-consuming, tedious and expensive, in terms of both human and other resources. Development can be accelerated if existing resources from closely related languages can be reused. A popular theme in the fields of speech and language processing is to find innovative ways to expedite this process as cost-effectively as possible, especially for so-called “resource-scarce” languages (i.e. languages without sufficient annotated electronic data to enable statistical approaches to speech and language processing). Because HLT is still a relatively new field in South Africa, most of the South African languages are severely under-resourced in terms of the data and software required to develop HLT applications, such as automatic speech recognition engines and speech synthesis systems. Because resources from one language are “recycled” for the benefit of another language, we refer to this as a “recycling approach”.
