Abstract
This paper proposes a speech emotion recognition method based on wavelet analysis of decomposed speech data obtained via empirical mode decomposition (EMD). Instead of analyzing the speech signal directly, the intrinsic mode functions (IMFs) are first extracted using EMD, and the discrete wavelet transform (DWT) is then applied only to the selected dominant IMFs. Both the approximation and detail DWT coefficients of the dominant IMF are taken into consideration. Some higher-order statistics of these EMD-DWT coefficients are found to exhibit distinguishing characteristics across emotions, and these statistical parameters are chosen as the features. For classification, a K-nearest-neighbor (KNN) classifier is employed along with hierarchical clustering. Extensive simulations are carried out on the widely used EMO-DB speech emotion database, using four emotion classes: angry, happy, sad, and neutral. Simulation results show that the proposed EMD-wavelet features provide satisfactory recognition performance with a reduced feature dimension.
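
The feature extraction and classification steps described above can be illustrated with a short sketch. This is not the authors' implementation. It assumes the dominant IMF has already been extracted by an EMD routine (not shown), uses a single-level Haar wavelet as a stand-in because the abstract does not name the wavelet, takes skewness and kurtosis as examples of higher-order statistics, and uses plain KNN without the hierarchical clustering step. All function names are hypothetical.

```python
import math
from collections import Counter


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    n = len(signal) - len(signal) % 2  # drop a trailing odd sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, n, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, n, 2)]
    return approx, detail


def skewness_kurtosis(values):
    """Third and fourth standardized moments (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0.0:
        return 0.0, 0.0  # constant input: moments are undefined, return zeros
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def emd_dwt_features(imf):
    """Feature vector for one IMF (assumed to come from a separate EMD step):
    skewness and kurtosis of its approximation and detail coefficients."""
    approx, detail = haar_dwt(imf)
    return list(skewness_kurtosis(approx)) + list(skewness_kurtosis(detail))


def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    order = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

In this sketch each utterance is reduced to a four-element vector per dominant IMF, which reflects the low feature dimension the abstract emphasizes. A practical version would likely use a multi-level DWT from a wavelet library such as PyWavelets, and would choose the wavelet and the statistics according to the full paper.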