Abstract

Speech emotion recognition (SER) plays an important role in human-computer interaction (HCI) and has a wide range of applications in medicine, psychotherapy, and other fields. In recent years, with the development of deep learning, many researchers have combined feature extraction techniques with deep learning to extract more discriminative emotional information. However, training on the speech emotion classification task alone makes it difficult to exploit feature information effectively, resulting in feature redundancy. Therefore, this paper uses speech feature enhancement (SFE) as an auxiliary task to provide additional information for the SER task. The paper combines Long Short-Term Memory (LSTM) networks with soft decision trees and proposes a multi-task learning framework based on a decision tree structure. Specifically, it trains the LSTM network by computing the distances between features at different leaf nodes of the soft decision tree, thereby obtaining enhanced speech feature representations. The results show that the algorithm achieves 85.6% accuracy on the EMO-DB dataset and 81.3% accuracy on the CASIA dataset, an improvement of 11.8% and 14.9% over the baseline, respectively, demonstrating the effectiveness of the method. In addition, cross-database experiments, real-time performance analysis, and noisy-environment analysis show that the approach performs reliably across different databases, maintains real-time processing capability, and is robust to noise.
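
The abstract does not give implementation details; the PyTorch-style sketch below only illustrates the kind of multi-task architecture it describes: a shared LSTM encoder with a main emotion-classification head, an auxiliary feature-enhancement head, and a soft decision tree whose leaf prototypes supply a feature-distance term. All layer sizes, the routing rule, the pooling, and the loss weights are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative sketch only: a shared LSTM encoder trained with a main SER head,
# an auxiliary speech-feature-enhancement (SFE) head, and a soft decision tree
# whose leaf prototypes define a distance-based term. Dimensions and weights are
# assumed, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftDecisionTree(nn.Module):
    """Depth-d soft tree: inner nodes route with sigmoid gates, leaves hold prototypes."""

    def __init__(self, feat_dim: int, depth: int = 3):
        super().__init__()
        self.depth = depth
        self.n_leaves = 2 ** depth
        self.inner = nn.Linear(feat_dim, 2 ** depth - 1)       # one gate per inner node
        self.leaf_proto = nn.Parameter(torch.randn(self.n_leaves, feat_dim))

    def forward(self, x):                                       # x: (B, feat_dim)
        gates = torch.sigmoid(self.inner(x))                    # (B, n_inner)
        probs = x.new_ones(x.size(0), 1)                        # start at the root
        idx = 0
        for _ in range(self.depth):
            g = gates[:, idx:idx + probs.size(1)]               # gates for this level
            probs = torch.stack([probs * g, probs * (1 - g)], dim=2).flatten(1)
            idx += g.size(1)
        # Expected squared distance between the feature and the leaf prototypes,
        # weighted by each leaf's path probability.
        dists = torch.cdist(x, self.leaf_proto) ** 2            # (B, n_leaves)
        return probs, (probs * dists).sum(dim=1).mean()


class MultiTaskSER(nn.Module):
    """Shared LSTM encoder with an emotion head and an auxiliary enhancement head."""

    def __init__(self, n_feats=40, hidden=128, n_emotions=7, tree_depth=3):
        super().__init__()
        self.lstm = nn.LSTM(n_feats, hidden, batch_first=True)
        self.emotion_head = nn.Linear(hidden, n_emotions)       # main SER task
        self.enhance_head = nn.Linear(hidden, n_feats)          # auxiliary SFE task
        self.tree = SoftDecisionTree(hidden, tree_depth)

    def forward(self, feats):                                   # feats: (B, T, n_feats)
        h, _ = self.lstm(feats)
        pooled = h.mean(dim=1)                                  # utterance-level feature
        logits = self.emotion_head(pooled)
        enhanced = self.enhance_head(h)                         # frame-level reconstruction
        _, tree_dist = self.tree(pooled)
        return logits, enhanced, tree_dist


# Toy usage with random tensors standing in for acoustic features and clean targets.
model = MultiTaskSER()
feats = torch.randn(8, 200, 40)                                 # 8 utterances of frame features
labels = torch.randint(0, 7, (8,))
clean = torch.randn(8, 200, 40)                                 # hypothetical enhancement targets
logits, enhanced, tree_dist = model(feats)
loss = (F.cross_entropy(logits, labels)                         # main SER loss
        + 0.5 * F.mse_loss(enhanced, clean)                     # auxiliary SFE loss
        + 0.1 * tree_dist)                                      # leaf-distance term
loss.backward()
```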
