Abstract

Recognizing expressive states from speech has become an important research area in recent years. However, distinguishing angry speech from Lombard speech remains difficult. The objective of this work is to analyze the differences between Lombard and angry speech using features that represent the excitation source of speech production. Three excitation source features are used: the instantaneous fundamental frequency, the strength of excitation, and a loudness measure that reflects the sharpness of the impulse-like excitation around the epochs. The distribution curves of these three parameters are then plotted. We employ Gaussian mixture models (GMMs) and the Kullback-Leibler (KL) divergence, a measure of relative entropy, to quantify the differences between angry, Lombard, and neutral speech with respect to these parameters. The results show that Lombard and angry speech signals differ at the excitation source level.
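To illustrate how a KL divergence between two GMM-modeled feature distributions can be computed, the following is a minimal sketch and not the procedure used in this work. The KL divergence between two Gaussian mixtures has no closed form, so the sketch assumes a Monte Carlo estimate: draw samples from the first mixture and average the log-density ratio. The helper names (`gmm_logpdf`, `gmm_sample`, `kl_divergence_mc`) and the mixture parameters for the fundamental frequency are hypothetical values chosen only for illustration; they are not taken from the paper's data.

```python
import math
import random


def gmm_logpdf(x, gmm):
    """Log density of a 1-D Gaussian mixture given as [(weight, mean, std), ...]."""
    terms = [
        math.log(w) - math.log(s) - 0.5 * math.log(2 * math.pi)
        - 0.5 * ((x - m) / s) ** 2
        for w, m, s in gmm
    ]
    # Log-sum-exp for numerical stability when components are far apart.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def gmm_sample(gmm, rng):
    """Draw one sample: pick a component by weight, then sample its Gaussian."""
    weights = [w for w, _, _ in gmm]
    _, mean, std = rng.choices(gmm, weights=weights, k=1)[0]
    return rng.gauss(mean, std)


def kl_divergence_mc(p, q, n_samples=20000, seed=0):
    """Monte Carlo estimate of KL(p || q) = E_p[log p(x) - log q(x)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(p, rng)
        total += gmm_logpdf(x, p) - gmm_logpdf(x, q)
    return total / n_samples


# Hypothetical two-component mixtures over fundamental frequency (Hz).
# These numbers are made up for illustration, not measured values.
neutral_f0 = [(0.6, 120.0, 15.0), (0.4, 150.0, 20.0)]
angry_f0 = [(0.5, 180.0, 25.0), (0.5, 230.0, 30.0)]

if __name__ == "__main__":
    print("KL(neutral || angry):", kl_divergence_mc(neutral_f0, angry_f0))
    print("KL(angry || neutral):", kl_divergence_mc(angry_f0, neutral_f0))
```

Because the KL divergence is asymmetric, the two directions generally give different values; a symmetric comparison would need both, for example by summing them. In practice the mixture parameters would be fitted to the measured feature values rather than written by hand.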
