Abstract

In this paper, a speech emotion recognition method is proposed based on wavelet analysis of speech data decomposed via empirical mode decomposition (EMD). Instead of analyzing the given speech signal directly, the intrinsic mode functions (IMFs) are first extracted using EMD, and the discrete wavelet transform (DWT) is then performed only on the selected dominant IMFs. Both the approximation and detail DWT coefficients of the dominant IMFs are taken into consideration. It is found that certain higher-order statistics of these EMD-DWT coefficients exhibit distinguishing characteristics across different emotions, and these statistical parameters are therefore chosen as the features. For classification, a K-nearest neighbor (KNN) classifier is employed along with hierarchical clustering. Extensive simulations are carried out on the widely used EMO-DB speech emotion database, considering four emotion classes, namely angry, happy, sad, and neutral. Simulation results show that the proposed EMD-wavelet-based features provide quite satisfactory recognition performance with a reduced feature dimension.
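
As a rough illustration of the pipeline described in the abstract, the following Python sketch extracts EMD-DWT statistical features from a speech signal and feeds them to a KNN classifier. It assumes the PyEMD, PyWavelets, SciPy, and scikit-learn packages; the energy-based selection of the dominant IMFs, the db4 wavelet, the particular statistics (variance, skewness, kurtosis), and the value of K are illustrative assumptions rather than the exact configuration of the paper, and the hierarchical clustering stage is omitted.

# Sketch of the EMD-DWT feature extraction and KNN classification pipeline.
# Assumptions (not taken from the paper): energy-based dominant-IMF selection,
# 'db4' wavelet, (variance, skewness, kurtosis) as higher-order statistics, K = 5.
import numpy as np
import pywt
from PyEMD import EMD
from scipy.stats import skew, kurtosis
from sklearn.neighbors import KNeighborsClassifier

def emd_dwt_features(signal, n_dominant=2, wavelet="db4"):
    # Decompose the speech signal into intrinsic mode functions (IMFs).
    imfs = EMD().emd(np.asarray(signal, dtype=float))
    # Select the dominant IMFs; here "dominant" means highest energy (an assumption).
    energies = np.sum(imfs ** 2, axis=1)
    dominant = imfs[np.argsort(energies)[::-1][:n_dominant]]
    features = []
    for imf in dominant:
        # Single-level DWT: approximation (cA) and detail (cD) coefficients.
        cA, cD = pywt.dwt(imf, wavelet)
        for coeffs in (cA, cD):
            # Higher-order statistics of the EMD-DWT coefficients as features.
            features.extend([np.var(coeffs), skew(coeffs), kurtosis(coeffs)])
    return np.array(features)

# Usage with labelled utterances (train_signals, train_labels assumed available):
# X = np.vstack([emd_dwt_features(s) for s in train_signals])
# clf = KNeighborsClassifier(n_neighbors=5).fit(X, train_labels)
# pred = clf.predict(emd_dwt_features(test_signal).reshape(1, -1))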
