Given the increasing need for interactive human-computer applications, the use of machine learning algorithms to recognize emotions from speech has attracted a substantial surge in interest. While emotion recognition systems have made considerable progress in languages such as German, English, Spanish, Dutch, and Danish, comprehensive datasets for the Kurdish language remain notably limited. This paper addresses this gap by focusing on emotion recognition in speech data in the Sorani Kurdish dialect, which was carefully gathered from openly available YouTube videos and categorized into four distinct perceived emotions: neutral, sadness, happiness, and anger. The study extracted both Mel spectrogram and Mel-Frequency Cepstral Coefficient (MFCC) spectral features, then evaluated three classification models on them: K-Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), and Support Vector Machine (SVM). A close comparison of the results across the feature extraction methods showed that SVM obtained the highest accuracy, reaching up to 85.57%. This is considerably higher than the accuracy of the first Kurdish emotion classification technique for recognizing emotion in spoken words.
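The two feature types named above can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the frame length, hop size, number of Mel bands, number of coefficients, and the synthetic 440 Hz test tone are all assumptions chosen for illustration, and the helper names (`mel_filterbank`, `log_mel_spectrogram`, `mfcc`) are hypothetical. The sketch computes a log-Mel spectrogram, derives MFCCs from it with an unnormalized DCT-II, and pools the frames into one fixed-length vector of the kind a KNN, MLP, or SVM classifier could take as input.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) Hz-to-Mel mapping.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters whose centers are evenly spaced on the Mel scale.
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    # Slice the signal into overlapping Hann-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Power spectrum per frame, projected onto the Mel filters.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel_energy + 1e-10)  # shape: (n_frames, n_mels)

def mfcc(log_mel, n_mfcc=13):
    # Unnormalized DCT-II along the Mel axis; keep the first n_mfcc coefficients.
    n_mels = log_mel.shape[1]
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))  # (n_mfcc, n_mels)
    return log_mel @ basis.T  # shape: (n_frames, n_mfcc)

# Assumption: a one-second synthetic tone stands in for a real speech clip.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440.0 * t)

log_mel = log_mel_spectrogram(signal, sr)
coeffs = mfcc(log_mel)

# Pool over time so every clip yields a vector of the same length.
utterance_vector = np.concatenate([coeffs.mean(axis=0), coeffs.std(axis=0)])
print(log_mel.shape, coeffs.shape, utterance_vector.shape)
```

Pooling with the per-coefficient mean and standard deviation is only one way to obtain a fixed-length input; the abstract does not say how the study summarized frames before classification.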