The amalgamation of wavelet packet information gain entropy tuned source and system parameters for improved speech emotion recognition

Hemanta Kumar Palo, Swapna Subudhiray, Niva Das

doi:10.1016/j.specom.2023.03.007




Abstract

This paper proposes a three-stage feature selection algorithm for the automatic classification of speech emotions (SE) that combines wavelet packet (WP) decomposition, statistical analysis, and an Information Gain (IG) feature ranking algorithm based on high IG entropy. Effective frame-level vocal tract system and excitation source parameters are first extracted from a few significant WP sub-bands containing emotionally relevant information based on Eigenvalue decomposition. Several intelligent amalgamations of the derived optimal feature vectors are then formed for improved performance in a low-dimensional feature space. The fundamental argument is that if the identification errors of a system subjected to individual feature streams occur at separate points, the inclusion of complementary information may nullify some of these errors by increasing the available information. Cost-Sensitive Decision Tree (CS-DT), Support Vector Machine (SVM), and Decision Tree (DT) models have been validated and tested with the proposed setup for their efficacy. Results indicate the superiority of the proposed algorithms over other published approaches cited in the literature, with the CS-DT outperforming the others. The proposed low-dimensional amalgamation vectors achieve more than 20% improvement in recognition performance, with greater speed of response, savings in cost, and lower F-rank, which is a significant achievement in this direction. Code to compute the Hurst parameter for the feature amalgamation is available at (https://www.mathworks.com/matlabcentral/fileexchange/9842-hurst-exponent).
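
To make the IG ranking stage concrete, the following is a minimal sketch of ranking features by information gain with respect to emotion labels. It is not the authors' implementation: the abstract does not state how continuous features are discretised or which IG variant is used, so the equal-width binning, the `bins` parameter, and the function names `entropy`, `information_gain`, and `rank_features` are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, bins=4):
    """IG of one continuous feature after equal-width binning (an assumed discretisation)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    groups = {}
    for v, y in zip(values, labels):
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        groups.setdefault(idx, []).append(y)
    # Entropy of the labels that remains once the feature bin is known.
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels, bins=4):
    """Return (feature_index, IG) pairs sorted from most to least informative."""
    scores = [(j, information_gain(col, labels, bins)) for j, col in enumerate(columns)]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy data (hypothetical): feature 0 separates the two emotions, feature 1 is constant.
labels = ["angry", "angry", "sad", "sad"]
columns = [[0.1, 0.2, 0.9, 1.0], [0.5, 0.5, 0.5, 0.5]]
ranking = rank_features(columns, labels)
```

In this toy case the first feature is ranked above the constant one, which carries no information about the label. A low-dimensional vector in the spirit of the abstract would keep only the top-ranked entries of `ranking`.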
