Abstract

This paper proposes a dense fusion transformer (DFT) framework that integrates textual, acoustic, and visual information for multimodal affective computing. DFT exploits a modality-shared transformer (MT) module to extract modality-shared features by jointly modelling unimodal, bimodal, and trimodal interactions. MT constructs a series of dense fusion blocks that fuse the utterance-level sequential features of the multiple modalities at both low-level and high-level semantics. In particular, MT adopts local and global transformers to learn modality-shared representations by modelling inter- and intra-modality interactions. Furthermore, we devise a modality-specific representation (MR) module with a soft orthogonality constraint that penalizes the distance between modality-specific and modality-shared representations; the two kinds of representations are then fused by a transformer to make affective predictions. Extensive experiments on five public benchmark datasets show that DFT outperforms state-of-the-art baselines.
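To make the soft orthogonality constraint concrete, the sketch below shows one common way such a penalty is implemented: the squared Frobenius norm of the cross-correlation between modality-shared and modality-specific representation matrices. This is an illustrative formulation under assumed tensor shapes and names (h_shared, h_specific), not the paper's exact loss.

```python
# Minimal sketch of a soft orthogonality penalty between modality-shared and
# modality-specific representations (assumed formulation; names are illustrative).
import torch


def soft_orthogonality_penalty(h_shared: torch.Tensor,
                               h_specific: torch.Tensor) -> torch.Tensor:
    """Squared Frobenius norm of the cross-correlation between the two
    representation matrices; driving it toward zero encourages the shared
    and specific subspaces to stay distinct without a hard constraint."""
    # h_shared, h_specific: (batch, dim) utterance-level representations.
    h_shared = h_shared - h_shared.mean(dim=0, keepdim=True)
    h_specific = h_specific - h_specific.mean(dim=0, keepdim=True)
    return (h_shared.t() @ h_specific).pow(2).sum()


if __name__ == "__main__":
    torch.manual_seed(0)
    shared = torch.randn(8, 64)    # e.g. output of a modality-shared module
    specific = torch.randn(8, 64)  # e.g. output of a modality-specific module
    # In training, a penalty like this would be computed per modality and
    # added (with a weight) to the affective prediction loss.
    print(soft_orthogonality_penalty(shared, specific))
```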
