Abstract
Multi-modal image fusion aims to generate a fused image by integrating and distinguishing the cross-modality complementary information from multiple source images. While the cross-attention mechanism with global spatial interactions appears promising, it captures only second-order spatial interactions and neglects higher-order interactions in both the spatial and channel dimensions. This limitation hampers the exploitation of synergies between modalities. To bridge this gap, we introduce a Synergistic High-order Interaction Paradigm (SHIP), designed to systematically investigate spatial fine-grained and global-statistics collaborations between the multi-modal images across two fundamental dimensions: 1) Spatial dimension: we construct spatial fine-grained interactions through element-wise multiplication, which is mathematically equivalent to global interactions, and then build high-order forms by iteratively aggregating and evolving complementary information, enhancing both efficiency and flexibility. 2) Channel dimension: expanding on channel interactions with first-order statistics (mean), we devise high-order channel interactions that help discern inter-dependencies between source images based on global statistics. We further introduce an enhanced version of the SHIP model, called SHIP++, which strengthens the cross-modality information interaction representation through a cross-order attention evolving mechanism, cross-order information integration, and a residual information memorizing mechanism. Harnessing high-order interactions significantly enhances our model's ability to exploit multi-modal synergies, leading to superior performance over state-of-the-art alternatives, as shown through comprehensive experiments across various benchmarks in two significant multi-modal image fusion tasks: pan-sharpening, and infrared and visible image fusion. The source code is publicly available at https://github.com/manman1995/HOIF.
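The two kinds of interaction named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that): it omits all learned layers, and the names `channel_moments` and `high_order_spatial`, the `image[c][p]` list layout, and the choice of central moments as the "high-order statistics" are assumptions made here for illustration only.

```python
from typing import List

Channel = List[float]   # one channel, flattened to a list of pixel values
Image = List[Channel]   # hypothetical layout: image[c][p]


def channel_moments(image: Image, max_order: int = 3) -> List[List[float]]:
    """Per-channel global statistics: the mean (first order), followed by
    central moments of order 2..max_order (an assumed choice of statistic)."""
    stats = []
    for channel in image:
        n = len(channel)
        mean = sum(channel) / n
        moments = [mean]
        for k in range(2, max_order + 1):
            moments.append(sum((v - mean) ** k for v in channel) / n)
        stats.append(moments)
    return stats


def high_order_spatial(a: Image, b: Image, order: int = 3) -> Image:
    """Repeated element-wise multiplication of features from modality `a`
    with features from modality `b`. One product is a second-order
    interaction; each further product raises the order by one."""
    fused = [list(channel) for channel in a]
    for _ in range(order - 1):
        fused = [
            [f * g for f, g in zip(f_channel, b_channel)]
            for f_channel, b_channel in zip(fused, b)
        ]
    return fused


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]   # toy features: 2 channels, 2 pixels each
    b = [[2.0, 2.0], [0.5, 1.0]]
    print(high_order_spatial(a, b, order=3))
    print(channel_moments(a, max_order=3))
```

With `order=2` the spatial function reduces to a single element-wise product, which is the second-order case the abstract contrasts against; larger values of `order` simply repeat the product.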
More From: IEEE Transactions on Pattern Analysis and Machine Intelligence