Abstract

Extracting valuable information from emotional speech is one of the major challenges in emotion recognition and human-machine interfaces. Most research in emotion recognition is based on the analysis of fundamental frequency, energy contour, silence duration, formants, Mel-band energies, linear prediction cepstral coefficients, and Mel frequency cepstral coefficients. It has been observed that emotion classification using sinusoidal features performs better than classification using linear prediction and cepstral features. Harmonic models are considered a variant of the sinusoidal model. Analyzing different harmonic features of emotional speech is therefore a critical step toward improving the emotional speech classification rate and the conversion of neutral speech to emotional speech. In this paper, investigations have been carried out on the Berlin emotional speech database to analyze gender-based emotional speech using harmonic plus noise model (HNM) features and Gaussian mixture models (GMMs). The analysis has been performed with HNM features such as pitch, harmonic amplitudes, maximum voiced frequency, and noise components. The results show that the different emotions of male and female speakers can each be represented by a K-component GMM distribution, where the optimal number of components is chosen on the basis of the Akaike information criterion (AIC) score.
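As a minimal sketch of the AIC-based model selection described above (not the authors' implementation), the following Python snippet fits GMMs with an increasing number of components to a feature matrix and keeps the model with the lowest AIC score. The feature matrix here is a hypothetical stand-in for per-frame HNM features (e.g., pitch, maximum voiced frequency, harmonic amplitudes, noise energy); the candidate range of K is also an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical HNM feature matrix: one row per analysis frame, one column
# per feature (e.g., pitch, maximum voiced frequency, harmonic amplitude,
# noise energy). Random data stands in for real extracted features.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 4))

best_k, best_aic, best_model = None, np.inf, None
for k in range(1, 11):  # candidate numbers of GMM components (assumed range)
    gmm = GaussianMixture(n_components=k, covariance_type="full",
                          random_state=0).fit(features)
    aic = gmm.aic(features)  # Akaike information criterion for this fit
    if aic < best_aic:
        best_k, best_aic, best_model = k, aic, gmm

print(f"Optimal number of GMM components by AIC: {best_k} (AIC = {best_aic:.1f})")
```

In this scheme, AIC penalizes the log-likelihood by the number of free parameters, so the selected K balances goodness of fit against model complexity; the same procedure would be run separately on male and female speech for each emotion class.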
