Abstract

Existing deep learning-based fusion methods usually model local information through convolution operations or global contexts via the self-attention mechanism. This serial scheme discards local features when modeling global context, or vice versa, which can limit fusion performance. To tackle this issue, we introduce a mixed-frequency hierarchical guided learning network, or FreqFuse for short. More specifically, we first design a parallel frequency mixer based on a channel splitting mechanism, with max-pooling and self-attention paths, to learn both high- and low-frequency information. The mixer provides more comprehensive features over a wide frequency range than a single type of dependency. Second, we develop a dual-Transformer integration module to guide the fusion process. The assigned weights are calculated by cross-token and cross-channel Transformers, and are used to measure the activity levels of the source images and preserve their modality characteristics in the intermediate fused features. On this basis, we build a hierarchical guidance decoder to reconstruct the final fused image. The cross-scale mixed-frequency features are reused to gradually refine the activity levels of the different modality images, encouraging a fused result that is highly informative and strongly characteristic of its sources. We benchmark the proposed FreqFuse on different datasets, and experimental results demonstrate that it achieves impressive performance compared with other methods.
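The sketch below illustrates the parallel frequency mixer idea summarized above: input channels are split, a max-pooling path emphasizes high-frequency (local) responses, a self-attention path captures low-frequency (global) context, and the two bands are recombined. It is a minimal illustration only; the class name, the even 50/50 channel split, the specific layer choices, and the use of PyTorch are assumptions, not the authors' implementation.

```python
# Minimal sketch of a parallel frequency mixer with channel splitting.
# All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn


class ParallelFrequencyMixer(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        assert channels % 2 == 0, "channels are split evenly between the two paths"
        half = channels // 2
        # High-frequency path: max-pooling keeps sharp, local responses.
        self.high_freq = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(half, half, kernel_size=3, padding=1),
        )
        # Low-frequency path: self-attention models global, smooth context.
        self.attn = nn.MultiheadAttention(half, num_heads, batch_first=True)
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel splitting: one half of the channels per frequency path.
        x_hi, x_lo = torch.chunk(x, 2, dim=1)
        hi = self.high_freq(x_hi)

        b, c, h, w = x_lo.shape
        tokens = x_lo.flatten(2).transpose(1, 2)      # (B, H*W, C/2)
        lo, _ = self.attn(tokens, tokens, tokens)     # global self-attention
        lo = lo.transpose(1, 2).reshape(b, c, h, w)

        # Recombine both bands into mixed-frequency features.
        return self.fuse(torch.cat([hi, lo], dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(ParallelFrequencyMixer(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```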
