Arabic Speech Recognition by Stationary Bionic Wavelet Transform and MFCC Using a Multi-layer Perceptron for Voice Control

Talbi Mourad

doi:10.1007/978-3-030-93405-7_4

Abstract

In this chapter, we detail our approach to single-speaker Arabic speech recognition with a reduced vocabulary, previously introduced in the literature. The approach has three steps. The first uses our own speech database of Arabic words, recorded by a single speaker and intended for voice command. The second extracts features from the recorded words, and the third classifies those features. Feature extraction begins by applying the stationary bionic wavelet transform (SBWT) to each recorded word. The resulting stationary bionic wavelet coefficients are concatenated into a single vector, from which the Mel Frequency Cepstral Coefficients (MFCCs) are computed. The MFCCs are then concatenated to form one input of a multi-layer perceptron (MLP), which performs the classification. For training and testing the MLP, we used ten Arabic words, each repeated 25 times by the same speaker. A simulation program used to evaluate the proposed approach showed a classification rate of 98%.

Keywords: Arabic speech recognition; Feature extraction; Mel Frequency Cepstral Coefficients (MFCC); Stationary bionic wavelet transform (SBWT); Multi-layer perceptron (MLP)
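
The feature-extraction chain summarized in the abstract (stationary wavelet coefficients, concatenation, MFCCs, one concatenated input vector) can be illustrated with a minimal pure-Python sketch. It is not the chapter's implementation: the SBWT is replaced by a one-level undecimated Haar transform as a stand-in, and the sampling rate, frame length, number of mel filters, number of cepstral coefficients, and all function names are assumptions chosen only for illustration.

```python
import cmath
import math

def undecimated_haar(x):
    """One level of an undecimated (stationary) Haar transform.
    Stand-in only: this is NOT the bionic wavelet transform of the chapter."""
    n = len(x)
    s = math.sqrt(2.0)
    approx = [(x[i] + x[(i + 1) % n]) / s for i in range(n)]
    detail = [(x[i] - x[(i + 1) % n]) / s for i in range(n)]
    return approx, detail

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    edges = [mel_to_hz(m) * n_fft / sample_rate for m in mels]  # in DFT-bin units
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def power_spectrum(frame):
    """Hamming-windowed power spectrum via a direct DFT (fine for short frames)."""
    n = len(frame)
    win = [frame[i] * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i in range(n)]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(win[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec

def mfcc_frame(frame, bank, n_coeffs):
    """Log mel-band energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    # Floor the band energies so that empty filters do not produce log(0).
    log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
             for row in bank]
    m = len(log_e)
    return [sum(log_e[j] * math.cos(math.pi * c * (j + 0.5) / m) for j in range(m))
            for c in range(n_coeffs)]

def word_features(signal, sample_rate=8000, frame_len=64, n_filters=12, n_coeffs=8):
    """Hypothetical pipeline: wavelet coefficients -> one vector -> MFCCs -> one input."""
    approx, detail = undecimated_haar(signal)
    stacked = approx + detail  # concatenate the sub-band coefficients
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(stacked) - frame_len + 1, frame_len):
        features.extend(mfcc_frame(stacked[start:start + frame_len], bank, n_coeffs))
    return features

# Synthetic 440 Hz tone standing in for one recorded word (assumed 8 kHz sampling).
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
features = word_features(tone)
```

With these assumed settings, each word yields one fixed-length vector that could serve as a single MLP input; the ten-word, 25-repetition database described above would then give 250 such vectors. Non-overlapping frames are used only to keep the sketch short.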
