Do Minority Views in Perceptual Evaluations Affect Confidence of Speech Emotion Classifiers?

Huang-Cheng Chou

doi:10.1109/aciiw57231.2022.10085994

Abstract

Emotion recognition plays an essential role in affective computing. Machines can infer emotional cues by capturing human behavior. Speech emotion recognition (SER) is one of the critical technologies that let a machine recognize emotion from speech. Most existing SER models rely on emotional annotations from human perceptual evaluations. However, emotion perception is subjective because of differing emotional experiences, backgrounds, and cultures, so disagreement among annotations from human perceptual evaluations is common. The most common way to decide on ground truths is a majority vote or plurality rule. Previous studies use these methods to generate the ground truths of the test data, but they discard any test data that have no consensus label. In this work, by contrast, we keep all annotations, calculate their frequencies, and generate the emotion ground truth using a thresholding method. The most important difference from previous work on SER is that we define the SER task as a multi-label task: each data point is allowed to have one or more emotions. After defining the ground truth of the complete test set, we explore whether removing minority annotations affects the confidence of SER systems. We use calibration error metrics to measure how well the confidence of predictions from speech emotion classifiers matches their accuracy. We plan to investigate two research questions: (1) Which label learning methods (e.g., hard-label, soft-label, multi-label, or distribution-label learning) yield better-calibrated classifiers without applying any calibration method? (2) Can predicting the agreement among annotators on sentence-level annotations improve the calibration of speech emotion classifiers? In our preliminary experiments, we train SER systems with the distribution-label learning method, without discarding any annotations, as a first step toward answering the second question. We evaluate the preliminary experiments on the MSP-PODCAST corpus and report the results under several evaluation metrics.
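
The labeling and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the emotion set `EMOTIONS`, the function names, the example threshold of `1 / len(EMOTIONS)`, and the choice of binned expected calibration error (ECE) as the calibration metric are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter

# Hypothetical label set; the abstract does not list the emotion classes.
EMOTIONS = ["angry", "sad", "happy", "neutral"]


def label_distribution(annotations, classes=EMOTIONS):
    """Relative frequency of each class among one utterance's annotations.

    Every annotation is kept, including minority views.
    """
    if not annotations:
        raise ValueError("at least one annotation is required")
    counts = Counter(annotations)
    return [counts[c] / len(annotations) for c in classes]


def multi_label_target(distribution, threshold):
    """Mark every class whose frequency reaches the threshold (multi-label)."""
    return [1 if p >= threshold else 0 for p in distribution]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over equal-width bins.

    This is one common calibration error metric, assumed here for illustration.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes to last bin
        bins[idx].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece


# Four annotators split evenly between two emotions: no majority label exists,
# yet the utterance is kept and receives two labels.
dist = label_distribution(["happy", "sad", "happy", "sad"])
target = multi_label_target(dist, threshold=1 / len(EMOTIONS))  # assumed threshold
print(dist, target)
```

Under a majority-vote rule the evenly split utterance in the usage lines would be discarded from the test set; with frequency thresholding it stays and carries both labels, which is the distinction the abstract draws.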

