Abstract

Existing image fusion methods pay little attention to human visual characteristics, even though these characteristics play an important role in visual processing tasks. To address this gap, we propose a cross-modal image fusion method that combines an illuminance factor with attention mechanisms, studying and simulating human visual characteristics in the cross-modal image fusion task. First, we perform cross-modal multi-scale decomposition to prevent the mixing of high- and low-frequency components and to reduce the halo effect. Second, to suppress highlights, the visual saliency map and the deep feature map are combined with the illuminance fusion factor to perform non-linear fusion of the high- and low-frequency components. Third, the feature maps are selected through a channel attention network to obtain the final fused image. Finally, we validate our method on public datasets of infrared and visible images. The experimental results demonstrate the superiority of our fusion method in complex illumination environments, as well as the effectiveness of our simulation of human visual characteristics.
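
The following is a minimal sketch of the pipeline outlined above, not the authors' implementation. It assumes simple stand-ins: Gaussian filtering for the multi-scale decomposition, local contrast for the visual saliency map, mean brightness mapped through a sigmoid for the illuminance fusion factor, and a softmax over channel means in place of the paper's channel attention network and deep feature maps. All function names are hypothetical.

import numpy as np
from scipy.ndimage import gaussian_filter


def decompose(img, sigma=5.0):
    """Split an image into a low-frequency base layer and a high-frequency detail layer."""
    base = gaussian_filter(img, sigma)
    return base, img - base


def saliency(img, sigma=5.0):
    """Crude visual-saliency proxy: normalized deviation from the blurred local mean."""
    s = np.abs(img - gaussian_filter(img, sigma))
    return s / (s.max() + 1e-8)


def illuminance_factor(vis, k=10.0):
    """Map the mean brightness of the visible image to a (0, 1) fusion factor (assumed form)."""
    return 1.0 / (1.0 + np.exp(-k * (vis.mean() - 0.5)))


def channel_attention(stack):
    """Simplified channel attention: weight candidate maps by a softmax over their channel means."""
    means = stack.mean(axis=(1, 2))
    w = np.exp(means - means.max())
    w /= w.sum()
    return np.tensordot(w, stack, axes=1)


def fuse(ir, vis):
    """Illustrative fusion of an infrared and a visible image, both float arrays in [0, 1]."""
    ir_base, ir_detail = decompose(ir)
    vis_base, vis_detail = decompose(vis)
    lam = illuminance_factor(vis)                     # illuminance fusion factor
    w_ir, w_vis = saliency(ir), saliency(vis)         # per-pixel saliency weights
    base = lam * vis_base + (1.0 - lam) * ir_base     # low-frequency (base) fusion
    detail = (w_ir * ir_detail + w_vis * vis_detail) / (w_ir + w_vis + 1e-8)
    candidates = np.stack([base + detail, ir, vis])   # stand-in "feature maps" to select from
    return channel_attention(candidates)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ir = rng.random((128, 128))    # placeholder infrared image
    vis = rng.random((128, 128))   # placeholder visible image
    print(fuse(ir, vis).shape)     # (128, 128)

Under low visible-light illuminance the factor lam shrinks, shifting the base layer toward the infrared input, which mirrors the illuminance-aware behavior the abstract describes at a very coarse level.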
