Abstract

In this paper, we design an infrared (IR) and visible (VIS) image fusion method based on unsupervised densely connected networks, termed TPFusion. Activity level measurement and fusion rules are indispensable parts of conventional image fusion methods, but designing an appropriate fusion process by hand is time-consuming and complicated. In recent years, deep learning-based methods have been proposed to address this problem. However, for multi-modality image fusion, a single shared network cannot extract effective feature maps from source images captured by different sensors. TPFusion avoids this issue. We first extract the textural information of the source images; then two densely connected networks are trained to fuse the textural information and the source images, respectively. In this way, more textural details are preserved in the fused image. Moreover, the loss functions that constrain the two densely connected convolutional networks are designed according to the characteristics of the textural information and the source images, so the fused image retains more of the sources' textural information. To validate our method, we conduct comparison and ablation experiments with both qualitative and quantitative assessments. The ablation experiments confirm the effectiveness of TPFusion. Compared with existing advanced IR and VIS image fusion methods, our method produces better fusion results in both objective and subjective terms: qualitatively, our fused images show higher contrast and richer textural details; quantitatively, TPFusion outperforms representative existing fusion methods.
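As a rough illustration of the two-branch design the abstract describes, the sketch below shows a small densely connected fusion branch, assuming a PyTorch implementation. The layer widths, growth rate, depth, and the way the two branches are wired are illustrative assumptions, not the authors' exact TPFusion architecture.

```python
# Illustrative sketch only: a small densely connected branch of the kind the
# abstract describes (one branch for textural/gradient maps, one for source
# images). Channel counts and depths are assumptions, not TPFusion's exact design.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer receives the concatenation of all preceding feature maps."""
    def __init__(self, in_ch=2, growth=16, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True)))
            ch += growth
        self.out_ch = ch

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

class FusionBranch(nn.Module):
    """Takes concatenated IR/VIS inputs and produces a single fused map."""
    def __init__(self):
        super().__init__()
        self.dense = DenseBlock(in_ch=2)
        self.fuse = nn.Conv2d(self.dense.out_ch, 1, kernel_size=1)

    def forward(self, ir, vis):
        return torch.tanh(self.fuse(self.dense(torch.cat([ir, vis], dim=1))))

# Two branches trained separately, as in the abstract: one on textural
# (gradient) maps and one on the source images themselves.
texture_branch, image_branch = FusionBranch(), FusionBranch()
```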

Highlights

  • Multi-sensor image fusion is an effective technique that fuses multi-source images into a single image containing complementary information for better visual understanding

  • To preserve the textural information of the visible and infrared images in the fused image, we calculate the gradients of the source images and use max(∗) to take their element-wise maximum as the target gradient, where |∗|_l1 denotes the l1 distance (a brief sketch of this step follows the highlights)

  • We present a novel network architecture for IR and visible (VIS) image fusion, called TPFusion
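The sketch below illustrates the highlighted gradient step, assuming a PyTorch implementation: the gradient (texture) map of each source image is computed, the element-wise maximum max(∗) is taken as the target, and an l1 distance |∗|_l1 penalizes the fused image's deviation from it. The Sobel-based gradient operator and the mean reduction are assumptions, not necessarily the authors' exact choices.

```python
# Illustrative sketch of the max-gradient l1 texture loss (assumed details).
import torch
import torch.nn.functional as F

def gradient(img):
    """Approximate image gradient magnitude with Sobel filters (assumed choice)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def texture_loss(fused, ir, vis):
    """l1 distance between the fused gradient and max(grad_ir, grad_vis)."""
    target = torch.max(gradient(ir), gradient(vis))          # max(*) over the two sources
    return torch.mean(torch.abs(gradient(fused) - target))   # |*|_l1, mean-reduced
```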


Summary

Introduction

Multi-sensor image fusion is an effective technique that fuses multi-source images into a single image containing complementary information for better visual understanding. In [13], the authors proposed a general image fusion model based on convolutional neural networks, termed IFCNN. In this model, a simple encoder was used to obtain feature information from the source images. FusionGAN cannot balance the weights of the generator and the discriminator during training, which may cause information from the source images to be lost. In addition to these representative deep learning fusion algorithms, many variants of them have been proposed. Existing deep learning-based approaches aim to design an appropriate loss function and a novel network architecture; these two parts are essential in deep learning-based fusion methods and largely determine the visual quality of the fused image [25,26].

The Structure of TPFusion
Loss Function
Training
Experiment Results and Analysis
Comparison Methods
Qualitative Comparisons
Quantitative Comparisons
Ablation Experiment
Conclusions and Future Work