Abstract

Research on speaker and accent recognition using the Malay language is limited in the field of Automatic Speech Recognition (ASR), where most studies focus on speech recognition. This study proposes to improve the performance of Malaysian speaker and accent recognition using wavelet transforms, namely the Wavelet Packet Transform (WPT) and the Dual-Tree Complex Wavelet Packet Transform (DT-CWPT). Several feature extraction combinations, including conventional Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC) and wavelet transforms, were implemented to compare the effectiveness of the proposed method. Although the proposed approach improved the detection rate, it suffered from high feature dimensionality and increased computation time. To address these issues, a Genetic Algorithm (GA) was adopted to reduce the number of irrelevant features, accelerate learning and achieve better performance. The extracted features were used to train several classifiers, including k-Nearest Neighbors (k-NN), Support Vector Machine (SVM) and Extreme Learning Machine (ELM). The experimental results showed that the best speaker recognition accuracy was 97.33% for English numbers using the SVM classifier and 96.02% for Malay words using the ELM classifier with a combination of wavelet, LPC and MFCC features. For accent recognition, the ELM classifier yielded the best performance, achieving 95.28% accuracy for English numbers with a combination of wavelet and MFCC features and 96.72% for Malay words with a combination of wavelet, LPC and MFCC features. It can be concluded that Malay words yielded better recognition rates than English numbers. Furthermore, the use of GA effectively reduced the overall number of features while maintaining a high level of accuracy.
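The GA-based feature reduction step described above can be illustrated with a minimal sketch. It is not the study's implementation: the toy data, the leave-one-out 1-nearest-neighbour fitness, the per-feature penalty, and all names (`make_toy_data`, `fitness`, `ga_select`) and parameter values are assumptions chosen only to show how a binary feature mask can be evolved to drop irrelevant features.

```python
import random

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the features where mask[j] == 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    # Reward accuracy and lightly penalise each retained feature
    # (the penalty weight is an assumed value, not taken from the study).
    return loo_1nn_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=20, generations=30, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n = len(X[0])
    # Include the all-features mask so the result is never worse than using everything.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        elite = scored[:2]  # elitism: the two best masks survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            # Parents are drawn from the better half of the population.
            a, b = rng.sample(scored[: pop_size // 2], 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per feature.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(X, y, m))

def make_toy_data(n_per_class=15, n_noise=4, seed=1):
    """Two informative features followed by n_noise pure-noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            informative = [rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)]
            noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y

X, y = make_toy_data()
best_mask = ga_select(X, y)
print("selected mask:", best_mask)
print("features kept:", sum(best_mask), "of", len(best_mask))
```

In a real pipeline the rows of `X` would be the combined wavelet, LPC and MFCC feature vectors, and the fitness would use whichever classifier is being evaluated (k-NN, SVM or ELM) with proper cross-validation rather than this toy 1-NN score.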
