Abstract

Previous studies of speech emotion recognition have used either empirical features (e.g., F0, energy, and voicing probability) or spectrogram-based statistical features. The empirical features encode human knowledge about emotion recognition, while the statistical features provide a general representation but do not emphasize such knowledge sufficiently. Using the two kinds of features together can therefore supply complementary cues that humans may exploit unconsciously in daily life but that have not yet been made explicit. Based on this consideration, this paper proposes a dynamic fusion framework that exploits the complementary strengths of spectrogram-based statistical features and auditory-based empirical features. In addition, a kernel extreme learning machine (KELM) is adopted as the classifier to distinguish emotions. To validate the proposed framework, we conduct experiments on two public emotional databases, Emo-DB and IEMOCAP. The experimental results demonstrate that the proposed fusion framework significantly outperforms existing state-of-the-art methods, and that integrating the auditory-based features with the spectrogram-based features yields a notable performance improvement over conventional methods.
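The abstract does not give implementation details of the classifier; as a rough illustration of the classification stage only, the sketch below shows a minimal kernel extreme learning machine in NumPy applied to a fused feature vector. The RBF kernel, the regularization parameter C, and the simple concatenation-based fusion are assumptions for illustration, not the paper's dynamic fusion framework.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel between rows of A and rows of B.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * d2)

class KELM:
    """Minimal kernel extreme learning machine (closed-form solution)."""
    def __init__(self, C=1.0, gamma=1.0):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        self.X_train = X
        self.classes_ = np.unique(y)
        # One-hot target matrix T (N x num_classes)
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        K = rbf_kernel(X, X, self.gamma)
        N = X.shape[0]
        # Output weights: beta = (I/C + K)^-1 T
        self.beta = np.linalg.solve(np.eye(N) / self.C + K, T)
        return self

    def predict(self, X):
        K_test = rbf_kernel(X, self.X_train, self.gamma)
        return self.classes_[np.argmax(K_test @ self.beta, axis=1)]

# Hypothetical usage: fuse empirical and spectrogram-statistics features
# per utterance by simple concatenation (feature names are placeholders).
# X_train = np.concatenate([empirical_feats, spectrogram_feats], axis=1)
# clf = KELM(C=10.0, gamma=0.05).fit(X_train, y_train)
# pred = clf.predict(X_test)
```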
