Abstract

Multi-modal image fusion plays a crucial role in various visual systems. However, existing methods typically involve a multi-stage pipeline, i.e., feature extraction, integration, and reconstruction, which limits the effectiveness and efficiency of feature interaction and aggregation. In this paper, we propose MixFuse, a compact multi-modal image fusion framework based on Transformers. It smoothly unifies the processes of feature extraction and integration. At its core, the Mix Attention Transformer Block (MATB) integrates the Cross-Attention Transformer Module (CATM) and the Self-Attention Transformer Module (SATM). The CATM introduces a symmetrical cross-attention mechanism to identify modality-specific and shared features, filtering out irrelevant and redundant information. Meanwhile, the SATM refines the combined features via a self-attention mechanism, enhancing their internal consistency and proper preservation. These successive cross- and self-attention modules work together to produce more accurate and refined feature maps, which are essential for the subsequent reconstruction. Extensive evaluation of MixFuse on five public datasets shows its superior performance and adaptability over state-of-the-art methods. The code and model will be released at https://github.com/Bitlijinfu/MixFuse.
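
The abstract describes the MATB as a cross-attention stage (CATM) followed by a self-attention stage (SATM). The following is a minimal illustrative sketch of that structure only, not the authors' released implementation: the class names' internals, dimensions, head counts, and the additive fusion of the two cross-attended streams are all assumptions made for illustration.

```python
# Illustrative sketch of a cross-attention-then-self-attention block.
# All module internals and hyperparameters below are assumptions, not the
# official MixFuse code (see https://github.com/Bitlijinfu/MixFuse).
import torch
import torch.nn as nn


class CATM(nn.Module):
    """Symmetrical cross-attention between two modality feature streams (assumed design)."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn_ab = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_ba = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_a = nn.LayerNorm(dim)
        self.norm_b = nn.LayerNorm(dim)

    def forward(self, feat_a, feat_b):
        # Each modality queries the other, letting shared and modality-specific
        # cues reinforce each other while redundant content is down-weighted.
        a, b = self.norm_a(feat_a), self.norm_b(feat_b)
        a2b, _ = self.attn_ab(a, b, b)
        b2a, _ = self.attn_ba(b, a, a)
        # Simple additive combination of the two cross-attended streams (assumption).
        return (feat_a + a2b) + (feat_b + b2a)


class SATM(nn.Module):
    """Self-attention refinement of the combined features (assumed design)."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, fused):
        x = self.norm(fused)
        refined, _ = self.attn(x, x, x)
        return fused + refined


class MATB(nn.Module):
    """Mix Attention Transformer Block: cross-attention followed by self-attention."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.catm = CATM(dim, num_heads)
        self.satm = SATM(dim, num_heads)

    def forward(self, feat_a, feat_b):
        return self.satm(self.catm(feat_a, feat_b))


# Usage: tokenized feature maps from two modalities, shape (batch, tokens, dim).
if __name__ == "__main__":
    block = MATB(dim=64)
    ir, vis = torch.randn(2, 256, 64), torch.randn(2, 256, 64)
    print(block(ir, vis).shape)  # torch.Size([2, 256, 64])
```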
