Abstract

Automatic speech emotion recognition plays an important role in intelligent human-computer interaction. Identifying emotion in natural, day-to-day, spontaneous conversational speech is difficult because the emotions expressed by the speaker are often not as prominent as in acted speech. In this paper, we propose a novel spontaneous speech emotion recognition framework that makes use of the available knowledge. The framework is motivated by the observation that there is significant disagreement among human annotators when they annotate spontaneous speech; this disagreement is largely reduced when they are provided with additional knowledge related to the conversation. The proposed framework uses contexts (derived from linguistic content) and knowledge of the time lapse between spoken utterances in an audio call to reliably recognize the speaker's current emotion in spontaneous audio conversations. Our experimental results demonstrate that the proposed framework yields a significant improvement in spontaneous speech emotion recognition performance.
