Abstract

In human-human interactions, speakers communicate their emotional state to listeners using a combination of linguistic and paralinguistic cues. Although comprehension of paralinguistic cues plays a substantial role in improving interaction, few works have explored this in the context of a social robot. This work presents a speech emotion recognition system based on paralinguistic cues, evaluated through emotion cards and emotional storytelling. An SVM classification model was trained to recognize eight emotions (Neutral, Calm, Happy, Sad, Angry, Fearful, Disgust, Surprise) in robot-directed speech based exclusively on vocalizations. Three simulated speech emotion datasets (RAVDESS, TESS, and RRLabSED) contributed 5972 training samples. Feature extraction and feature selection were carried out using the openSMILE toolkit and the reliefF algorithm, respectively. To validate the system on real-time data, speech emotion samples from 15 participants were gathered through two tasks (Task 1 and Task 2). Experimental results report balanced accuracy scores of 66.43% and 69.63% for speech emotion prediction on the two tasks, respectively. Finally, an HRI scenario was realized in which the robot adapted its behavior according to the recognized human speech emotion. Ratings on the Robot's Perceived Empathy scale support that the robot was perceived as capable of empathy.
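The pipeline described above (acoustic features, feature selection, SVM classification, balanced-accuracy evaluation) can be illustrated with a minimal sketch. This is not the authors' code: random arrays stand in for openSMILE functionals and emotion labels, the feature dimensions are placeholders, and a mutual-information selector is used as a stand-in for the reliefF step, which scikit-learn does not provide.

```python
# Illustrative sketch of an SVM speech-emotion pipeline (assumptions noted below).
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score

EMOTIONS = ["neutral", "calm", "happy", "sad",
            "angry", "fearful", "disgust", "surprise"]

# Placeholder data: in the described system these would be openSMILE
# functionals extracted from the three datasets (5972 samples total);
# the 988-dimensional feature vectors here are arbitrary.
rng = np.random.default_rng(0)
X = rng.normal(size=(5972, 988))
y = rng.integers(0, len(EMOTIONS), size=5972)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = Pipeline([
    ("scale", StandardScaler()),
    # Stand-in for reliefF feature selection; k=200 is an arbitrary choice.
    ("select", SelectKBest(mutual_info_classif, k=200)),
    ("svm", SVC(kernel="rbf", C=1.0)),
])
clf.fit(X_train, y_train)

# Balanced accuracy, the metric reported in the abstract.
print("balanced accuracy:",
      balanced_accuracy_score(y_test, clf.predict(X_test)))
```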
