Abstract

This study investigated the relationship between emotional states and prosody. A prosody detection algorithm [Choi et al., J. Acoust. Soc. Am. 188, 2579–2587] was applied to extract accents and intonational boundaries automatically. The measurements were derived from duration, pitch, harmonic structure, spectral tilt, and amplitude. Detection experiments on the Boston University Radio Speech Corpus showed equal-error detection rates of around 70% for both accent and intonational boundary detection. The algorithm was then applied to a subset of a Korean emotional database in which five sentences were spoken by 15 speakers in four emotions: neutral, joy, sadness, and anger. By comparing the ratio of events detected as accents and intonational boundaries between neutral and emotional speech, our experiments found different distributions of these events for each emotion. In preliminary experiments, joy and anger tended to have fewer events classified as boundaries than the other emotions, while joy and sadness showed more events corresponding to accents. These results indicate that prosody detection can be useful for emotion classification.
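The comparison described above amounts to computing, per emotion, the fraction of detected events that are accents or intonational boundaries and contrasting it with the neutral baseline. The sketch below is not the authors' code; it assumes the detector's per-syllable output is already available as lists of labels, and the label names, function names, and toy data are illustrative assumptions.

```python
# Minimal sketch of the neutral-vs-emotion event-ratio comparison.
# Assumes a prosody detector has already labelled each syllable as
# "accent", "boundary", or "none"; all data here is hypothetical.

def event_ratios(labels):
    """Return the fraction of syllables labelled as accent and as boundary."""
    n = len(labels)
    accents = sum(1 for lab in labels if lab == "accent")
    boundaries = sum(1 for lab in labels if lab == "boundary")
    return accents / n, boundaries / n

# Hypothetical detector output: emotion -> per-syllable event labels.
detector_output = {
    "neutral": ["accent", "none", "boundary", "none", "accent", "none"],
    "joy":     ["accent", "accent", "none", "accent", "none", "none"],
    "sadness": ["accent", "none", "accent", "boundary", "accent", "none"],
    "anger":   ["none", "accent", "none", "none", "accent", "none"],
}

neutral_acc, neutral_bnd = event_ratios(detector_output["neutral"])
for emotion, labels in detector_output.items():
    acc, bnd = event_ratios(labels)
    print(f"{emotion:8s} accent ratio {acc:.2f} ({acc - neutral_acc:+.2f} vs neutral), "
          f"boundary ratio {bnd:.2f} ({bnd - neutral_bnd:+.2f} vs neutral)")
```

Under this kind of comparison, systematically lower boundary ratios for joy and anger, or higher accent ratios for joy and sadness, would reproduce the pattern reported in the abstract.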
