Acoustic Nudging-Based Model for Vocabulary Reformulation in Continuous Yorùbá Speech Recognition

Lydia Kehinde Ajayi, Isaac Odun-Ayo, Ambrose Azeta, Enem Theophilus Aniemeka

doi:10.1007/978-3-031-10522-7_34

Abstract

Speech recognition is a technology that aids the processing of speech signals for communicating with computer applications. Previous studies exhibit speech recognition errors arising from users' acoustic irrational behavior. This paper presents an acoustic nudging-based model to address persistent automatic speech recognition errors that involve the user's acoustic irrational behavior and distort recognition accuracy. A Gaussian mixture model (GMM) was used to address the low-resource nature of the Yorùbá language and achieve better accuracy and system performance. The implemented results show that the proposed acoustic nudging-based model improves accuracy and system performance as measured by Word Error Rate (WER) and by validation, testing, and training accuracy. In evaluation, the model achieved a mean WER of 4.723%. Compared with previous models, this approach reduces the error rate by 1.1% (GMM), 0.5% (GMM-HMM), 0.8% (CNN), and 1.4% (DNN). This work therefore provides a foundation for advancing the current understanding of under-resourced languages and for developing an accurate and precise speech recognition model.

Keywords

Acoustic nudging model; Gaussian mixture model; Automatic speech recognition; Communication and nudging

Full Text