Abstract

In this paper, we present our investigations towards the development of multilingual automatic speech recognition (ML ASR) systems using the GlobalPhone database. In addition to GlobalPhone, we have included four Ethiopian languages (Amharic, Oromo, Tigrigna and Wolaytta), as well as Uyghur and English, in our investigation. To assess the impact of language relatedness on ML ASR training, we have analyzed both the phonetic overlap and the morphological complexity of the languages. Deep Neural Network (DNN) based ML ASR systems have been developed using ML mix, transfer and multitask learning approaches. Relative word error rate (WER) reductions of up to 33.21% have been achieved as a result of using the resources of other languages in ML acoustic model training. Our experimental results show that languages with small amounts of monolingual training data benefit substantially from ML training. Moreover, using phonetically related languages in ML training is more beneficial than using phonetically less related languages. The nature of the corpus (single or mixed domain, noisy or noise-free, etc.) also appears to have an impact on ML training, although it is not as important as the phonetic relatedness of the languages.
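
The abstract does not include code; the sketch below is only a generic illustration of the multitask ML training idea it mentions, in which hidden layers are shared across languages while each language keeps its own output layer. All layer sizes, language names and target counts here are placeholders, not values from the paper.

```python
# Hypothetical sketch (not the authors' implementation): a DNN acoustic model
# with hidden layers shared across languages and a language-specific softmax
# head per language, as in multitask multilingual acoustic model training.
import torch
import torch.nn as nn

class SharedHiddenDNN(nn.Module):
    def __init__(self, feat_dim=40, hidden_dim=1024, num_hidden=4,
                 targets_per_lang=None):
        super().__init__()
        # Placeholder languages and tied-state (senone) counts.
        targets_per_lang = targets_per_lang or {"amharic": 2000, "oromo": 2000}
        layers, in_dim = [], feat_dim
        for _ in range(num_hidden):
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        self.shared = nn.Sequential(*layers)      # hidden layers shared by all languages
        self.heads = nn.ModuleDict({               # one output layer per language
            lang: nn.Linear(hidden_dim, n) for lang, n in targets_per_lang.items()
        })

    def forward(self, feats, lang):
        # Returns logits over the chosen language's targets.
        return self.heads[lang](self.shared(feats))

# Usage: a batch of 40-dimensional acoustic feature vectors scored with one head.
model = SharedHiddenDNN()
logits = model(torch.randn(8, 40), "amharic")
```

For ML transfer learning, the shared layers of such a model would typically be reused to initialize a new model for a target language; for the ML mix approach, a single output layer over the pooled target set of all languages would replace the per-language heads.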
