Abstract

We propose an infrared and visible image fusion method based on the ResCC module and spatial criss-cross attention models. The method adopts an auto-encoder structure consisting of an encoder network, fusion layers, and a decoder network. The encoder network has one convolution layer and three ResCC blocks with dense connections. Each ResCC block extracts multi-scale features from the source images without down-sampling operations, retaining as many feature details as possible for image fusion. The fusion layer adopts spatial criss-cross attention models, which capture contextual information in both the horizontal and vertical directions. Restricting attention to these two directions also reduces the cost of computing the attention maps. The decoder network consists of four convolution layers designed to reconstruct the fused image from the feature map. Experiments on public datasets demonstrate that the proposed method achieves better fusion performance in both objective and subjective evaluations than other advanced fusion methods. The code is available at https://github.com/xiongzhangzzz/ResCCFusion.
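To make the cost claim concrete, the following is a minimal sketch that counts attention weights. It assumes that in criss-cross attention each position attends only to the positions in its own row and column, whereas full spatial attention attends to every position. The function names `criss_cross_neighbors` and `attention_map_entries` and the 64 x 64 feature-map size are illustrative assumptions, not taken from the paper or its repository.

```python
def criss_cross_neighbors(row, col, height, width):
    """Positions sharing a row or a column with (row, col), counted once each."""
    same_row = [(row, c) for c in range(width)]
    # Skip the centre position here; it is already included in same_row.
    same_col = [(r, col) for r in range(height) if r != row]
    return same_row + same_col


def attention_map_entries(height, width):
    """Number of attention weights for full vs. criss-cross spatial attention."""
    n_positions = height * width
    full = n_positions * n_positions                  # every position attends to all
    criss_cross = n_positions * (height + width - 1)  # own row plus own column
    return full, criss_cross


# Hypothetical 64 x 64 feature map, chosen only for illustration.
full, criss_cross = attention_map_entries(64, 64)
print(full, criss_cross)  # 16777216 520192
```

Under this assumption the number of weights per position grows as H + W - 1 rather than H x W, so the saving increases with feature-map resolution. That matters for an encoder that, as described above, avoids down-sampling.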
