Abstract

Chaha is a very low-resource language that lacks the language resources needed to develop human language technologies such as speech recognition. The Chaha writing system is syllabic, with a consonant-vowel (CV) syllable structure, and its orthography has a one-to-one correspondence with syllable sound units. Motivated by these properties, this study is the first to explore CV syllables as acoustic modeling units for Chaha speech recognizers, using a Gaussian mixture model (GMM) as well as unilingual and transfer learning deep neural network (DNN) models. Our experimental results show that the syllable-based unilingual DNN and transfer learning DNN models outperform the corresponding GMM and unilingual DNN models, with absolute performance improvements of 2.8 to 3.09% and 1.07 to 4.94%, respectively. The best-performing syllable-based recognizer uses a shared hidden layer (SHL) time delay deep neural network (TDNN) model and achieves a word error rate (WER) of 23.11%. Hence, CV syllables are suitable acoustic units for developing Chaha speech recognition systems, provided that a sufficient training corpus is available.
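
For readers unfamiliar with the metric, the WER figures above follow the standard definition: the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The sketch below is a minimal illustration of that definition. It is not the scoring tool used in this study, and the function name `word_error_rate` and its whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative sketch only: tokens are split on whitespace, which is an
    assumption and not necessarily how the study tokenized its transcripts.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis = i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr

    return prev[-1] / len(ref)
```

An absolute improvement, as reported above, is the plain difference between two such percentages (in percentage points), not a relative reduction.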
