Abstract

This study reports experimental results on whether the acoustic realization of vocal emotions differs between Mandarin and English. Prosodic cues, spectral cues and articulatory cues obtained with an electroglottograph (EGG) were compared for five emotions (anger, fear, happiness, sadness and neutral) within and across Mandarin and English in a production experiment. The within-language comparison showed that each vocal emotion had specific acoustic patterns in each language. Normalized data were used for the across-language comparison. The results indicated that Mandarin and English use pitch differently to encode emotions: the differences in pitch variation between neutral and the other emotions were significantly larger in English than in Mandarin. However, the variations of speech rate and certain phonation cues (e.g., cepstral peak prominence (CPP) and contact quotient (CQ)) were significantly greater in Mandarin than in English. The differences in emotional speech between the two languages may be due to the restriction of pitch variation by the presence of lexical tones in Mandarin. This study reveals that when a certain cue (e.g., pitch) is restricted in one language, other cues are strengthened to take on the responsibility of differentiating vocal emotions. Therefore, we posit that the acoustic realizations of emotional speech are multidimensional.

Highlights

  • Human speech is believed to be a reliable means of conveying a speaker’s emotional state [1]. In the past, researchers looked for acoustic cues that could clearly distinguish different vocal emotions (e.g., [2,3,4], among many others), mostly anger, fear, happiness and sadness from a core set of basic emotions [5].

  • This study examined whether a tonal language shows restricted pitch variation when encoding

Summary

Introduction

Human speech is believed to be a reliable means of conveying a speaker’s emotional state [1]. With these issues in mind, the current study aims to improve the understanding of the underlying mechanisms of vocal emotions in a tonal language (Mandarin) and a non-tonal language (English) through the collective analysis of prosodic cues, spectral cues and EGG cues. These two languages have been widely studied, which means our results can be compared with the existing literature. By adopting this new approach, we attempt to answer the following research questions: (a) whether Mandarin and English have different patterns with respect to prosodic cues, spectral cues and EGG cues in the vocal expression of emotions. The preliminary results of this study were presented at the 8th International Conference on Speech Prosody, as shown in Reference [31].

Speech Materials
Subjects
Recording Procedure
Listening Tests
Measurements
Results
Within-Language Comparison of Vocal Emotions
Mandarin
Pitch-normalized
English
Pitch-normalized EGG
Across-Language Comparison of Vocal Emotions
Prosodic of four emotions in Mandarin
Phonation
Sadness and fear reduced
Discussion and Conclusions
Acoustic and cues
Multidimensionality of the Acoustic Realization of Vocal Emotions