Abstract

Infrared and visible image fusion methods aim to combine salient target instances and abundant texture details into fused images. However, under harsh conditions such as dense smoke, fog, and intense light, external interference information can be integrated into the fused image, seriously degrading image quality. To this end, we propose an infrared and visible image fusion network with visual perception and cross-scale attention modules, termed VCAFusion, which integrates critical information from source images more efficiently under harsh conditions. Specifically, since the human eye can identify key information under adverse conditions, we design a visual perception module (VPM) that guides information integration from the perspective of human visual perception. In addition, we propose a cross-scale attention module (CSAM) based on shifted window cross-attention, which captures long-distance correlations between adjacent-scale features and provides more accurate image information for image restoration. Experiments on a variety of datasets show that VCAFusion adaptively retains image information and further improves image generation ability under harsh conditions.
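As a rough illustration of the cross-attention idea behind a cross-scale module like CSAM (not the paper's implementation), the sketch below lets coarse-scale tokens query fine-scale tokens with single-head scaled dot-product attention. The function name, shapes, and the absence of learned projections and window shifting are simplifying assumptions for this sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_scale_attention(fine, coarse):
    """Hypothetical single-head cross-attention between adjacent scales.

    Queries come from the coarser scale; keys and values come from the
    finer scale, so each coarse token aggregates long-distance context
    from fine-scale detail (no learned projections in this sketch).

    fine:   (N_f, C) tokens from the finer scale
    coarse: (N_c, C) tokens from the coarser scale
    returns (N_c, C) coarse tokens enriched with fine-scale information
    """
    C = fine.shape[1]
    q = coarse                 # queries from the coarse scale
    k, v = fine, fine          # keys/values from the fine scale
    attn = softmax(q @ k.T / np.sqrt(C), axis=-1)  # (N_c, N_f)
    return attn @ v

rng = np.random.default_rng(0)
fine = rng.normal(size=(16, 8))   # e.g. a 4x4 window of fine-scale tokens
coarse = rng.normal(size=(4, 8))  # a 2x2 window of coarse-scale tokens
out = cross_scale_attention(fine, coarse)
print(out.shape)
```

A real shifted-window variant would additionally partition both feature maps into (shifted) local windows and attend within corresponding windows, which keeps the attention cost linear in image size.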
