Abstract

Speech recognition in noisy environments is a long-standing research theme that remains an important challenge. Consequently, a great deal of research targets techniques for improving the performance of speech recognition systems under poor conditions. This paper presents a comparative study, under various noise conditions, of two architectures: hybrid GMM-HMM models built with the CMU Sphinx tools and hybrid DNN-HMM models built with the KALDI toolkit. We compare the two families of models to evaluate the performance of the proposed system. The novelty of this paper is to test whether these tools can recognize Arabic speech with good accuracy, principally in noisy environments. In addition, we adopt noisy training for both the GMM-HMM and DNN-HMM models. We use the public Arabic Speech Corpus for Isolated Words (20 words), with three noise levels and three noise types. The implementation of our system consists of two phases: feature extraction using Mel-frequency cepstral coefficients (MFCC), and classification, which uses each of the two models separately. To test the performance of these methods, simulations are presented for different SNRs and for distinct types of noise.
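
As an illustration of how noisy conditions at a given SNR can be simulated, the sketch below scales a noise recording so that the mixture with a clean utterance reaches a requested signal-to-noise ratio. It is a minimal example under stated assumptions, not the procedure used in the study: the function name `mix_at_snr` is hypothetical, SNR is assumed to be defined on average signal power over the whole utterance, and the noise is assumed to be looped or trimmed to the utterance length.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the mixture has the requested SNR in dB.

    Hypothetical helper: assumes SNR is measured on average power
    over the whole utterance.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Loop the noise if it is shorter than the speech, then trim to length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")

    # SNR_dB = 10 * log10(P_s / (g^2 * P_n))  =>  solve for the gain g.
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise
```

Calling `mix_at_snr(clean, noise, snr_db)` once per noise type and noise level yields the noisy copies of each utterance; the specific SNR values and noise recordings are inputs and are left to the experimental setup.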
