Abstract

Approaches based on Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) have achieved substantial performance in the field of image fusion. However, these methods focus on extracting local features and pay little attention to learning global dependencies. In recent years, owing to their competitive long-term dependency modeling capability, Transformer-based fusion methods have made impressive achievements; however, they process long-term correspondences and short-term features simultaneously, which may result in insufficient global–local information interaction. To this end, we propose a decoupled global–local infrared and visible image fusion Transformer (DGLT-Fusion). DGLT-Fusion decouples global–local information learning into a Transformer module and a CNN module: long-term dependencies are modeled by a series of Transformer blocks (global-decoupled Transformer blocks), while short-term features are extracted by local-decoupled convolution blocks. In addition, we design Transformer dense connections to preserve more information. The two kinds of blocks are stacked in an interweaving manner, which enables our network to retain texture and detail information more completely. Comparative experiments show that DGLT-Fusion achieves better performance than state-of-the-art approaches.
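To illustrate the decoupled, interweaved design described above, the following is a minimal PyTorch-style sketch. The class and parameter names (GlobalDecoupledTransformerBlock, LocalDecoupledConvBlock, InterweavedStack, depth, num_heads) are illustrative assumptions, not the authors' released code, and the Transformer dense connections mentioned in the abstract are omitted for brevity.

```python
# Hedged sketch of an interweaved global-Transformer / local-CNN stack.
# All class and argument names are assumptions for illustration only.
import torch
import torch.nn as nn


class LocalDecoupledConvBlock(nn.Module):
    """Local branch (assumed form): plain convolutions for short-term features."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        # Residual connection helps retain texture and detail information.
        return x + self.body(x)


class GlobalDecoupledTransformerBlock(nn.Module):
    """Global branch (assumed form): self-attention over flattened spatial tokens."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, C) token sequence
        normed = self.norm(tokens)
        attn_out, _ = self.attn(normed, normed, normed)
        tokens = tokens + attn_out                  # model long-term dependencies
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class InterweavedStack(nn.Module):
    """Alternate global Transformer blocks and local conv blocks, as the
    abstract describes the two modules being interweavingly stacked."""
    def __init__(self, channels, depth=3):
        super().__init__()
        layers = []
        for _ in range(depth):
            layers.append(GlobalDecoupledTransformerBlock(channels))
            layers.append(LocalDecoupledConvBlock(channels))
        self.stack = nn.Sequential(*layers)

    def forward(self, x):
        return self.stack(x)


if __name__ == "__main__":
    # Example: a 64-channel feature map from concatenated IR/visible features.
    feats = torch.randn(1, 64, 32, 32)
    fused = InterweavedStack(channels=64, depth=3)(feats)
    print(fused.shape)  # torch.Size([1, 64, 32, 32])
```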
