Abstract
Speech emotion recognition is one of the most active areas of research in affective computing and social signal processing. However, most research is directed towards a select group of languages such as English, German, and French. This is mainly due to a scarcity of publicly available datasets in other languages, which are therefore called low-resource languages. In the recent past, there has been a concerted effort within the research community to create and introduce emotion recognition datasets for low-resource languages. To this end, we introduce in this paper the Urdu-Sindhi Speech Emotion Corpus, a novel dataset consisting of 1,435 speech recordings for two widely spoken languages of South Asia, namely Urdu and Sindhi. Furthermore, we trained machine learning models to establish a baseline for classification performance, measured in terms of unweighted average recall (UAR). We report that the best performing model for the Urdu language achieves a UAR of 65.00% on the validation partition and a UAR of 56.96% on the test partition. Meanwhile, the model for the Sindhi language achieves UARs of 66.50% and 55.29% on the validation and test partitions, respectively. This classification performance is considerably better than the chance level UAR of 16.67%. The dataset can be accessed via https://zenodo.org/record/3685274.
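As an illustration of the metric used above, the following minimal sketch computes UAR as the mean of the per-class recalls, so that every class counts equally regardless of how many recordings it has. The function names and the toy labels are hypothetical and are not taken from the paper. The six-class setting in `chance_level_uar(6)` is an assumption inferred from the reported chance level of 16.67% (1/6); the abstract itself does not state the number of emotion classes.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (UAR, also called macro-averaged recall)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1

    # Each class contributes one recall value, independent of its size.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


def chance_level_uar(num_classes):
    """UAR expected from a classifier that always predicts one class or guesses uniformly."""
    return 1.0 / num_classes


# Toy example with hypothetical labels (not data from the corpus).
y_true = ["anger", "anger", "anger", "anger", "joy", "joy"]
y_pred = ["anger", "anger", "anger", "joy", "joy", "anger"]

uar = unweighted_average_recall(y_true, y_pred)  # (3/4 + 1/2) / 2
chance = chance_level_uar(6)                     # assumed six classes

print(f"UAR: {100 * uar:.2f}%")
print(f"Chance level: {100 * chance:.2f}%")
```

In the toy example the plain accuracy is 4/6, whereas the UAR is 0.625, because the smaller class is weighted as heavily as the larger one. This is why UAR is often preferred when classes are imbalanced.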
Highlights
According to the Oxford dictionary, the word emotion is defined as a strong feeling such as love, fear, or anger; the part of a person’s character that consists of feelings
We introduce a novel speech emotion dataset consisting of 1,435 audio recordings which can be used to train machine learning models for speech-based emotion recognition in two South Asian languages, namely Urdu and Sindhi
The rest of the paper is organized as follows: in Section II we introduce the methodology for collecting the Urdu-Sindhi Speech Emotion Corpus, whereas in Section III we detail the methodology for establishing the baseline classification performance for the dataset
Summary
According to the Oxford dictionary, the word emotion is defined as a strong feeling such as love, fear, or anger; the part of a person’s character that consists of feelings. In the psychology research literature, however, there is no consensus on a definition of emotion. According to [1], an emotion is any mental experience with high intensity and high hedonic content (pleasure/displeasure). [2] defines emotion as a complex psychological event that involves a mixture of reactions: 1) a physiological response, 2) an expressive reaction (distinctive facial expression, body posture, or vocalization), and 3) some kind of subjective experience (internal thoughts and feelings). The expression of feelings, and by extension emotions, is a fundamental part of human behavior. Emotions play an important role in how one thinks and behaves, which means that analyzing the emotions exhibited by individuals can yield insights into their thought processes.
More From: International Journal of Advanced Computer Science and Applications