Abstract

In this paper, a novel multiscale amplitude feature is proposed using multiresolution analysis (MRA), and the significance of the vocal tract is investigated for emotion classification from the speech signal. MRA decomposes the speech signal into a number of sub-band signals. The proposed feature is computed by applying a sinusoidal model to each sub-band signal. Different emotions affect the vocal tract differently, so the vocal tract responds in a distinct way for each emotion. The vocal tract information is enhanced using pre-emphasis, so that the emotion information manifested in the vocal tract can be better exploited, which may help improve emotion classification performance. Emotion recognition is performed on the German emotional speech database (EMODB), the interactive emotional dyadic motion capture database, the simulated stressed speech database, and the FAU AIBO database, using both the speech signal and speech with enhanced vocal tract information (SEVTI). The performance of the proposed multiscale amplitude feature is compared with three different types of features: 1) the mel frequency cepstral coefficients; 2) the Teager energy operator (TEO)-based feature (TEO-CB-Auto-Env); and 3) the breathiness feature. The proposed feature outperforms the other features. In terms of recognition rates, the features derived from the SEVTI signal perform better than the features derived from the speech signal. Combining the features derived from the SEVTI signal yields an average recognition rate of 86.7% on the EMODB database.
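
The processing chain summarized above (pre-emphasis, sub-band decomposition, sinusoidal amplitudes per sub-band) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the wavelet, the pre-emphasis coefficient, the number of decomposition levels, or how sinusoidal amplitudes are estimated. The Haar wavelet, `alpha=0.97`, three levels, and simple spectral peak picking are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np

def pre_emphasis(x, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].
    alpha=0.97 is a common default, assumed here (not given in the abstract)."""
    x = np.asarray(x, dtype=float)
    return np.concatenate(([x[0]], x[1:] - alpha * x[:-1]))

def haar_subbands(x, levels=3):
    """Haar wavelet decomposition as a stand-in for MRA (assumed wavelet).
    Returns [detail_1, ..., detail_L, approx_L]."""
    approx = np.asarray(x, dtype=float)
    bands = []
    for _ in range(levels):
        if len(approx) % 2:                      # pad odd lengths by repeating the last sample
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        bands.append((even - odd) / np.sqrt(2))  # detail (high-pass) band
        approx = (even + odd) / np.sqrt(2)       # approximation (low-pass) band
    bands.append(approx)
    return bands

def peak_amplitudes(band, n_peaks=3):
    """Magnitudes of the strongest local maxima in the windowed spectrum,
    a crude proxy for sinusoidal-model amplitudes (assumed estimator)."""
    spectrum = np.abs(np.fft.rfft(band * np.hanning(len(band))))
    interior = spectrum[1:-1]
    is_peak = (interior > spectrum[:-2]) & (interior >= spectrum[2:])
    peaks = np.sort(interior[is_peak])[::-1][:n_peaks]
    out = np.zeros(n_peaks)                      # zero-pad if fewer peaks were found
    out[:len(peaks)] = peaks
    return out

def multiscale_amplitude_feature(frame, levels=3, n_peaks=3, alpha=0.97):
    """Concatenate peak amplitudes from every sub-band of a pre-emphasized frame."""
    emphasized = pre_emphasis(frame, alpha)
    bands = haar_subbands(emphasized, levels)
    return np.concatenate([peak_amplitudes(b, n_peaks) for b in bands])
```

For a single frame, calling `multiscale_amplitude_feature(frame)` returns a vector with `(levels + 1) * n_peaks` entries. Skipping the `pre_emphasis` step would correspond to working on the plain speech signal rather than the SEVTI signal.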
