Abstract
Emotion recognition in conversation (ERC) is a vital task that requires deciphering human emotions through the analysis of contextual and multimodal information. However, existing ERC research focuses predominantly on multimodal fusion while overlooking two limitations: the discrepancy among unimodal representations and speaker dependencies. To address these problems, this paper proposes a Hierarchical decision fusion-based Local–Global Graph Neural Network for multimodal ERC (HiMul-LGG). HiMul-LGG employs a hierarchical decision fusion strategy to ensure feature alignment across modalities. Moreover, HiMul-LGG adopts a local–global graph neural network architecture to reinforce inter-modality and intra-modality speaker dependencies. Additionally, HiMul-LGG utilizes a cross-modal multi-head attention mechanism to promote interaction between modalities. We evaluate HiMul-LGG on two emotion recognition datasets, IEMOCAP and MELD, where it outperforms existing methods. The results of the ablation study further demonstrate the effectiveness of the proposed hierarchical decision fusion strategy and the local–global graph construction.
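To make the cross-modal multi-head attention component concrete, the sketch below shows one plausible realization in PyTorch: features from one modality serve as queries against another modality's features. This is a minimal illustration, not the paper's implementation; the module name, feature dimension, number of heads, and the pairing of text as query with audio as context are all assumptions.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Illustrative cross-modal multi-head attention block (not from HiMul-LGG).

    One modality's utterance features attend over another modality's features,
    letting each utterance representation absorb complementary cues.
    """

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_mod: torch.Tensor, context_mod: torch.Tensor) -> torch.Tensor:
        # query_mod:   (batch, num_utterances, dim), e.g. text features
        # context_mod: (batch, num_utterances, dim), e.g. audio features
        attended, _ = self.attn(query_mod, context_mod, context_mod)
        # Residual connection preserves the querying modality's own signal.
        return self.norm(query_mod + attended)

# Usage with hypothetical features for a 10-utterance conversation.
text = torch.randn(1, 10, 256)
audio = torch.randn(1, 10, 256)
fused = CrossModalAttention()(text, audio)
print(fused.shape)  # torch.Size([1, 10, 256])
```

In a multimodal setting such a block is typically applied in both directions (text attending to audio and audio attending to text) so that each modality's representation is enriched before fusion.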