Abstract

This paper addresses automatic recognition of emotional speech for Serbian. It integrates two research issues: (i) selection of an appropriate feature set, and (ii) investigation of different classification techniques. The paper reports a set of experiments with three feature sets: (i) the prosodic feature set, (ii) the spectral feature set, and (iii) the set of both spectral and prosodic features. Three classifiers were considered in all three experiments: the linear Bayes classifier, the perceptron rule, and the kNN classifier. The experimental results show that the highest recognition accuracy, 91.5%, was obtained with the third feature set using the linear Bayes classifier.

DOI: http://dx.doi.org/10.5755/j01.eee.18.9.2806

Highlights

  • Recognition of emotional speech in human-machine interaction is a challenging task

  • Even in cases when users' emotional state does not lead to the introduction of additional lexical information, changes in the acoustic features of affective speech may significantly degrade the accuracy of automatic speech recognition (ASR)

  • Our research integrates two research directions: (i) selection of a feature set, and (ii) investigation of different techniques for classification of emotional speech
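The experimental design described in the abstract (the same classifier evaluated on prosodic, spectral, and combined feature sets) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the emotion labels, feature dimensions, and function names are hypothetical, and only the kNN classifier (one of the three named in the abstract) is shown. It is not the authors' implementation and does not reproduce their results.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test vectors whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


def make_synthetic(n_per_class, rng):
    """Hypothetical data: each utterance has a 2-dim 'prosodic' vector and a
    3-dim 'spectral' vector; the two classes differ by a shift in the mean."""
    prosodic, spectral, labels = [], [], []
    for label, shift in (("neutral", 0.0), ("anger", 1.5)):
        for _ in range(n_per_class):
            prosodic.append([rng.gauss(shift, 1.0) for _ in range(2)])
            spectral.append([rng.gauss(shift, 1.0) for _ in range(3)])
            labels.append(label)
    return prosodic, spectral, labels


def compare_feature_sets(seed=0, n_per_class=40):
    """Evaluate the same kNN classifier on three feature sets."""
    rng = random.Random(seed)
    prosodic, spectral, labels = make_synthetic(n_per_class, rng)
    # The third feature set concatenates the prosodic and spectral vectors.
    combined = [p + s for p, s in zip(prosodic, spectral)]

    # Random half/half split into training and test utterances.
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    split = len(idx) // 2
    train, test = idx[:split], idx[split:]

    results = {}
    for name, feats in (
        ("prosodic", prosodic),
        ("spectral", spectral),
        ("combined", combined),
    ):
        results[name] = accuracy(
            [feats[i] for i in train],
            [labels[i] for i in train],
            [feats[i] for i in test],
            [labels[i] for i in test],
        )
    return results


if __name__ == "__main__":
    for name, acc in compare_feature_sets().items():
        print(f"{name}: {acc:.2f}")
```

Swapping `knn_predict` for another classifier while keeping the loop over feature sets unchanged gives the classifier-by-feature-set grid that the abstract describes.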

Introduction

Recognition of emotional speech in human-machine interaction is a challenging task. Even when a user's emotional state does not introduce additional lexical information (e.g., out-of-vocabulary words), changes in the acoustic features of affective speech may significantly degrade the accuracy of automatic speech recognition (ASR). Taking into account the changes in acoustic features that indicate emotion may substantially improve human-machine speech-based interfaces. This holds not only for ASR but also for other functional aspects of dialogue systems (e.g., natural-sounding text-to-speech synthesis). In the scope of customer care interactions (engaging a human operator or a conversational agent), emotional speech re…

Manuscript received March 12, 2012; accepted May 12, 2012. The presented study was performed as part of the project “Development of Dialogue Systems for Serbian and Other South Slavic Languages” (TR32035), as well as projects III44008 and OI178027, funded by the Ministry of Education and Science of the Republic of Serbia.
