Abstract
Recent research on deep-learning architectures has resulted in substantial improvements in automatic speech recognition accuracy. The rapid progress made in well-resourced languages can be attributed to the fact that these architectures are able to effectively represent spoken language in all its diversity and complexity. However, developing advanced models of a language without appropriate corpora of speech and text data remains a challenge. For many under-resourced languages, including those spoken in South Africa, such resources simply do not exist. The work reported on in this paper aims to address this situation by investigating the possibility of creating diverse speech resources from unannotated broadcast data. The paper describes how existing speech and text resources were used to develop a semi-automatic data harvesting procedure for two genres of broadcast data, namely news bulletins and radio dramas. It was found that adapting acoustic models with less than 10 hours of manually annotated data from the same domain significantly reduced transcription error rates for speaking styles and acoustic conditions that are not represented in any of the existing speech corpora. Results also indicated that substantially more automatically transcribed adaptation data would be required to achieve comparable improvements.