Abstract

The present study applies automatic phonetic classification to /e/ and /u/ tokens in Tol with two aims: first, I test existing claims about allophonic variation within these vowel classes, and second, I investigate allophonic variation within these vowel classes that has yet to be documented. The acoustic phonetic classifications derived here contribute to a more detailed understanding of the allophonic systems operating within the Tol language. Applying machine learning algorithms to under-resourced, indigenous languages has the potential to provide detailed insights into the acoustic phonetic dynamics of a diverse range of vocalic systems.

Summary

Introduction

The use of machine learning algorithms for linguistic research has gained traction in recent years, including in the subfields of semantics (Liang and Potts 2015, Potts 2019, Boleda 2020), phonology (Linzen 2019, Pater 2019, Rawski and Heinz 2019), and dialectology (Hartley 2005, Evanini 2008). The current study extends this line of inquiry by applying machine learning algorithms to describe the acoustic characteristics of the allophonic systems operating within an under-resourced, endangered language spoken in Central America. Whereas machine learning technologies have so far been applied to data from endangered languages primarily for automatic speech recognition (Besacier, Barnard, Karpov, and Schultz 2014; Rey and Nagy 2018; Mohammed 2020), the current study uses a machine learning algorithm to generate clusters of acoustic similarity for vocalic productions in order to explore patterns of phonologically conditioned allophonic splits in Tol. Tol, a Hokan language spoken by around 500 indigenous Tolupan people living on a reservation in south-central Honduras near Tegucigalpa, has been impressionistically described by several researchers, including Fleming and Dennis (1977) and Holt (1999). The current study set out to examine two particular claims about vocalic allophones that appear in both works, using acoustic data from Vox Clamantis, a large-scale corpus designed for research on phonetic typology (Salesky, Chodroff, Pimentel, Wiesner, Cotterell, Black, and Eisner 2020).
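As a rough illustration of what generating "clusters of acoustic similarity" for vowel tokens can look like, the sketch below groups tokens by their first two formants. It is only a sketch under stated assumptions: this section does not name the algorithm or the acoustic features the study uses, so the choice of k-means, the use of F1 and F2 in Hz, the function name `kmeans`, and all of the formant values (synthetic, not drawn from Tol or from the corpus) are hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=50):
    """Plain k-means on (F1, F2) tuples; returns (centroids, labels).

    Assumption: k-means stands in for whatever clustering method the
    study actually uses, which this section does not specify.
    """
    # Deterministic farthest-point initialisation: start from the first
    # token, then repeatedly add the token farthest from the centroids
    # chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each token joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, labels


# Synthetic (hypothetical) tokens of a single vowel class: two clouds of
# (F1, F2) values in Hz, standing in for two candidate allophones.
rng = random.Random(0)
tokens = [(rng.gauss(400, 15), rng.gauss(2100, 30)) for _ in range(30)]
tokens += [(rng.gauss(600, 15), rng.gauss(1700, 30)) for _ in range(30)]

centroids, labels = kmeans(tokens, k=2)
for j, (f1, f2) in enumerate(centroids):
    print(f"cluster {j}: mean F1 = {f1:.0f} Hz, mean F2 = {f2:.0f} Hz, "
          f"n = {labels.count(j)}")
```

In an analysis of this kind, the cluster labels would then be cross-tabulated against each token's phonological environment to check whether a cluster lines up with a conditioning context. In practice the formants would usually be normalized first, since raw F2 spans a wider range than F1 and would otherwise dominate the distance.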
