Speech recognition is an important method for controlling robots, mobile applications, automobiles, and similar systems. However, the recognition rate still needs to be improved, especially in noisy environments. The main goal of this paper is to determine which features are best for isolated word recognition by comparing the most common feature extraction technique, the Mel Frequency Cepstral Coefficient (MFCC), with a noise-robust feature extraction technique, the Power Normalized Cepstral Coefficient (PNCC). The comparison uses four classifiers (Weighted KNN, Medium KNN, Linear SVM, and Quadratic SVM) at different noise levels. The database used for this study is the Speech Commands Data Set v0.01, which contains 64,727 audio files, enough to give a good estimate of the recognition rate. For speaker-dependent speech recognition in a noise-free environment, the MFCC-based algorithm performs slightly better (by 1.1%) than the PNCC-based algorithm. However, the proposed PNCC-based algorithm is more effective than the MFCC-based algorithm for speaker-independent speech recognition and in overall recognition accuracy. It is also more robust to noise: it is more accurate than MFCC by 8.7% on average in tests with speakers already seen in training and by 18.528% on average in tests with new speakers. Finally, among the algorithms tested, PNCC feature extraction combined with Weighted KNN gives the highest accuracy for speaker-independent speech recognition, the highest overall recognition accuracy, and the greatest robustness to noise.
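As an illustration of the Weighted KNN classifier named above, the following is a minimal sketch of a distance-weighted nearest-neighbor vote. It is not the authors' implementation: the function name `weighted_knn_predict`, the inverse-squared-distance weighting, the Euclidean metric, and the toy 2-D vectors standing in for MFCC or PNCC feature vectors are all assumptions made for the example.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_features, train_labels, query, k=3, eps=1e-12):
    """Classify `query` by a distance-weighted vote among its k nearest neighbors."""
    # Euclidean distance from the query to every training vector.
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    # Keep only the k closest training examples.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Each neighbor votes with weight 1 / d^2 (an assumed weighting scheme);
    # eps avoids division by zero when the query matches a training point.
    votes = defaultdict(float)
    for distance, label in nearest:
        votes[label] += 1.0 / (distance ** 2 + eps)
    return max(votes, key=votes.get)


# Toy 2-D vectors standing in for per-utterance feature vectors (hypothetical data).
train_features = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (1.1, 0.9), (0.9, 1.2)]
train_labels = ["yes", "yes", "no", "no", "no"]

# With k=5 the plain majority is "no" (3 of 5), but the two close "yes"
# neighbors carry more weight than the three distant "no" neighbors.
predicted_word = weighted_knn_predict(train_features, train_labels, (0.3, 0.3), k=5)
print(predicted_word)
```

The point of the weighting is that close neighbors count for more than distant ones, so a prediction depends less on the exact choice of k than with an unweighted majority vote.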