Abstract

Existing image fusion approaches commonly employ a single deep network to solve different image fusion problems, and have achieved promising performance in recent years. However, in the absence of ground-truth outputs, these methods can exploit only the appearance of the source images during training to generate the fused images, which leads to suboptimal solutions. To this end, we advocate a self-evolutionary training formula by introducing a novel memory unit architecture (MUFusion). Specifically, in this unit, we utilize the intermediate fusion results obtained during training to collaboratively supervise the fused image. In this way, our fusion results can not only learn from the original input images, but also benefit from the intermediate outputs of the network itself. Furthermore, an adaptive unified loss function is designed based on this memory unit, composed of two terms, i.e., a content loss and a memory loss. In particular, the content loss is calculated based on the activity level maps of the source images, which constrains the output image to contain specific information. The memory loss, on the other hand, is obtained from the previous outputs of our model, and is utilized to push the network to yield fusion results of higher quality. Considering that handcrafted activity level maps cannot consistently reflect accurate saliency judgments, we introduce two adaptive weight terms between these two losses to prevent this degradation. In general, our MUFusion can effectively handle a series of image fusion tasks, including infrared and visible image fusion, multi-focus image fusion, multi-exposure image fusion, and medical image fusion. Specifically, the source images are first concatenated along the channel dimension. A densely connected feature extraction network with two scales is then used to extract deep features of the source images, after which the fusion result is obtained by two feature reconstruction blocks with skip connections from the feature extraction network. Qualitative and quantitative experiments on four image fusion subtasks demonstrate the superiority of our MUFusion compared with state-of-the-art methods.
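To make the adaptive unified loss concrete, the following is a minimal PyTorch sketch, not the authors' released code: it assumes a Laplacian-based activity level map, squared-error content and memory terms, and scalar weights w_content and w_memory standing in for the paper's adaptive weight items; all of these choices are hypothetical.

```python
import torch
import torch.nn.functional as F

def activity_level_map(img, eps=1e-6):
    # Hypothetical choice: absolute Laplacian response as a per-pixel
    # activity measure; the abstract does not specify the operator.
    kernel = torch.tensor([[0., 1., 0.],
                           [1., -4., 1.],
                           [0., 1., 0.]], device=img.device, dtype=img.dtype)
    kernel = kernel.view(1, 1, 3, 3).repeat(img.shape[1], 1, 1, 1)
    grad = F.conv2d(img, kernel, padding=1, groups=img.shape[1])
    return grad.abs().mean(dim=1, keepdim=True) + eps

def unified_loss_sketch(fused, src1, src2, prev_fused, w_content, w_memory):
    # Content term: each source supervises the output in proportion to
    # its (normalized) activity level at every pixel.
    a1, a2 = activity_level_map(src1), activity_level_map(src2)
    m1, m2 = a1 / (a1 + a2), a2 / (a1 + a2)
    content = (m1 * (fused - src1) ** 2 + m2 * (fused - src2) ** 2).mean()
    # Memory term: the network's own earlier output acts as a second
    # supervision signal (detached, so it is a fixed target).
    memory = F.mse_loss(fused, prev_fused.detach())
    return w_content * content + w_memory * memory
```

During training, prev_fused would be the intermediate fusion result stored by the memory unit for the same input pair at an earlier stage of training.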
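The pipeline described above (channel-wise concatenation, a two-scale densely connected extractor, and two reconstruction blocks with skip connections) can likewise be sketched as below; channel widths, growth rate, activations, and layer counts are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Densely connected conv block: each layer sees all earlier features."""
    def __init__(self, in_ch, growth=16, layers=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch + i * growth, growth, 3, padding=1)
            for i in range(layers))

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return torch.cat(feats, dim=1)

class FusionNetSketch(nn.Module):
    """Hypothetical reading of the abstract: concatenate the two sources,
    extract features at two scales with dense connectivity, then fuse via
    two reconstruction convs with a skip connection from the extractor."""
    def __init__(self, growth=16):
        super().__init__()
        c1 = 2 + 3 * growth                                  # dense block output width
        self.extract1 = DenseBlock(2, growth)                # full resolution
        self.down = nn.Conv2d(c1, 64, 3, stride=2, padding=1)
        self.extract2 = DenseBlock(64, growth)               # half resolution
        self.up = nn.ConvTranspose2d(64 + 3 * growth, 64, 2, stride=2)
        self.recon1 = nn.Conv2d(64 + c1, 64, 3, padding=1)   # skip from scale 1
        self.recon2 = nn.Conv2d(64, 1, 3, padding=1)

    def forward(self, a, b):
        x = torch.cat([a, b], dim=1)          # concatenate sources channel-wise
        f1 = self.extract1(x)
        f2 = self.extract2(torch.relu(self.down(f1)))
        u = torch.relu(self.up(f2))
        y = torch.relu(self.recon1(torch.cat([u, f1], dim=1)))
        return torch.sigmoid(self.recon2(y))
```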
