Abstract

The goal of this research was to develop a system that automatically measures changes in a speaker’s emotional state by analyzing his or her voice. Natural (non-acted) speech from 77 Dutch speakers was collected and manually split into speech units. Three recordings were collected per speaker, in which the speaker was in a positive, a neutral, and a negative state, respectively. For each recording, the speakers rated 16 emotional states on a 10-point Likert scale. The Random Forest algorithm was applied to 207 speech features extracted from the recordings to qualify (classification) and quantify (regression) the changes in the speaker’s emotional state. Results showed that the direction of change in emotions and the change in intensity, measured by Mean Squared Error, can be predicted better than the baseline (the mean value of change). Moreover, changes in negative emotions proved more predictable than changes in positive emotions.
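The comparison against the baseline can be illustrated with a minimal sketch. It assumes the baseline is a constant prediction equal to the mean change observed in the training data; the function names and all numbers below are hypothetical and are not taken from the study.

```python
from statistics import mean


def mse(y_true, y_pred):
    """Mean Squared Error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def baseline_mse(y_train, y_test):
    """MSE of always predicting the mean change seen in the training data."""
    constant = mean(y_train)
    return mse(y_test, [constant] * len(y_test))


def beats_baseline(y_train, y_test, y_pred):
    """True if the model's MSE is strictly lower than the mean baseline's."""
    return mse(y_test, y_pred) < baseline_mse(y_train, y_test)


# Hypothetical changes in one self-rated emotion between two recordings.
train_changes = [-3.0, -1.0, 0.0, 2.0, 4.0]
test_changes = [-2.0, 1.0, 3.0]

# Hypothetical output of a regression model for the test speakers.
model_predictions = [-1.5, 0.5, 2.0]

print("model MSE:   ", mse(test_changes, model_predictions))
print("baseline MSE:", baseline_mse(train_changes, test_changes))
print("beats baseline:", beats_baseline(train_changes, test_changes, model_predictions))
```

A model is only informative about the change in intensity if its error is lower than that of this constant predictor, which is the sense in which the results are reported as better than the baseline.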
