A Modified MFCC for Improved Wavelet-Based Denoising on Robust Speech Recognition

Risanuri Hidayat, Anggun Winursito

doi:10.22266/ijies2021.0228.02

Abstract

Current research on speech recognition is directed toward building noise-resistant systems. Mel Frequency Cepstral Coefficients (MFCC) extraction is a popular feature extraction method in speech recognition systems. In this paper, the sensitivity of MFCC to noise interference is the main motivation for developing a robust speech recognition system. The system was developed by improving denoising performance with a wavelet transform. The modification is based on an analysis of the weaknesses of wavelet denoising in a recognition system that uses MFCC. The analysis was conducted at one of the MFCC stages, the Fast Fourier Transform (FFT) stage. The proposed method applies wavelet denoising only to the noise-related data identified from the analysis of the FFT stage. The study used speech data consisting of eleven isolated English words with added noise of several different characteristics. The results show that the proposed method achieves higher accuracy than conventional wavelet denoising at signal-to-noise ratios (SNR) of 10 dB, 15 dB, and 20 dB when using the Fejer-Korovkin 6 wavelet. The largest accuracy increase was at an SNR of 15 dB, with a rise of 4.63%, followed by 3.96% at 20 dB and 2.3% at 10 dB. The proposed method was also compared with other methods and showed the best performance on clean speech and on noisy speech at SNRs of 10 dB, 15 dB, and 20 dB.
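
The SNR conditions mentioned in the abstract (10 dB, 15 dB, and 20 dB) can be illustrated with a minimal sketch of mixing noise into a signal at a target SNR. This is not the authors' procedure: the abstract does not say how the noise was mixed, so the synthetic 440 Hz tone, the white Gaussian noise, the 8 kHz sampling rate, and the function names `add_noise_at_snr` and `measure_snr_db` are all assumptions made for illustration.

```python
import math
import random


def power(samples):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def add_noise_at_snr(signal, noise, target_snr_db):
    """Scale `noise` so that signal + scaled noise has the target SNR (dB).

    Hypothetical helper: returns (noisy_signal, scaled_noise).
    """
    p_signal = power(signal)
    p_noise = power(noise)
    # SNR_dB = 10 * log10(p_signal / (gain^2 * p_noise)), solved for gain.
    gain = math.sqrt(p_signal / (p_noise * 10 ** (target_snr_db / 10)))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(signal, scaled)]
    return noisy, scaled


def measure_snr_db(signal, noise):
    """SNR in dB computed from separate signal and noise components."""
    return 10 * math.log10(power(signal) / power(noise))


# Assumed test data: one second of a 440 Hz tone at 8 kHz, plus white noise.
rng = random.Random(0)
fs = 8000
signal = [math.sin(2 * math.pi * 440 * t / fs) for t in range(fs)]
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

for target in (10, 15, 20):
    noisy, scaled = add_noise_at_snr(signal, noise, target)
    print(target, "dB target ->", round(measure_snr_db(signal, scaled), 2), "dB measured")
```

Because the gain is solved directly from the SNR definition, the measured SNR matches the target up to floating-point error. Real speech recordings and non-white noise would need the same scaling step, applied to the actual waveforms.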
