Abstract
Understanding and recognizing emotional states through speech has vast implications in areas ranging from customer service to mental health. In this paper, we investigate the relationship between adult and child speech for the task of automatic speech emotion recognition, focusing on the critical issue of limited datasets for children's emotions. We use two databases: IEMOCAP, which contains emotional speech recordings from adults, and AIBO, which includes recordings from children. To address the dataset limitations, we employ transfer learning by training a neural network to classify adult emotional speech, using a Wav2Vec model for feature extraction followed by a classification head for the downstream task. However, the labels in IEMOCAP and AIBO do not align perfectly, presenting a challenge in emotional mapping. To tackle this, we perform inference on children's data to examine how emotional labels in IEMOCAP correspond to those in AIBO, highlighting the complexities of cross-age emotional transfer. This approach achieved F-scores of up to 0.47. In addition, we trained separate male and female IEMOCAP models to determine how gender variation within adult speech affects emotional mapping on children's data. Our findings indicate that female samples align more with high-arousal emotions, while male samples align more with low-arousal emotions, underscoring the importance of gender in emotion recognition. To the best of our knowledge, this is the first deep-learning study of speech emotion recognition to analyze the effects of gender and age group on emotional mapping.
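The transfer-learning setup described above (a pretrained Wav2Vec feature extractor kept frozen, with a trainable classification head fine-tuned on adult IEMOCAP labels) can be sketched as follows. This is a minimal illustration, not the authors' code: the layer sizes, the four-class label set, and the stand-in linear encoder replacing the actual Wav2Vec model are all assumptions for the sake of a self-contained example.

```python
import torch
import torch.nn as nn

class EmotionClassifier(nn.Module):
    """Frozen speech encoder + trainable emotion classification head.

    In the paper the encoder is a pretrained Wav2Vec model; here a
    placeholder linear projection stands in so the sketch runs without
    downloading weights. Dimensions are illustrative assumptions.
    """

    def __init__(self, in_dim=1024, feat_dim=768, num_emotions=4):
        super().__init__()
        self.encoder = nn.Linear(in_dim, feat_dim)  # stand-in for Wav2Vec
        for p in self.encoder.parameters():
            p.requires_grad = False                 # freeze the extractor
        # Downstream head, trained on the adult (IEMOCAP) emotion labels.
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_emotions),
        )

    def forward(self, x):
        feats = self.encoder(x)        # (batch, time, feat_dim)
        pooled = feats.mean(dim=1)     # mean-pool over time frames
        return self.head(pooled)       # per-emotion logits

model = EmotionClassifier()
dummy_frames = torch.randn(2, 50, 1024)  # 2 clips, 50 frames of features
logits = model(dummy_frames)
print(logits.shape)                      # torch.Size([2, 4])
```

At inference time, the same trained head would be applied to features extracted from the children's (AIBO) recordings, and the predicted IEMOCAP labels compared against the AIBO annotations to study the cross-corpus emotional mapping.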