Abstract

In this paper, we present an effective infrared (IR) and visible (VIS) image fusion method based on a deep neural network. In our method, a Siamese convolutional neural network (CNN) automatically generates a weight map that represents the saliency of each pixel for a pair of source images. The CNN serves to automatically encode an image into a feature domain for classification. With the proposed method, the two key problems in image fusion, activity level measurement and fusion rule design, can be handled jointly in one step. The fusion is carried out through multi-scale image decomposition based on the wavelet transform, and the reconstructed result is more perceptually consistent with the human visual system. In addition, the visual effectiveness of the proposed fusion method is evaluated by comparing pedestrian detection results with those of other methods, using the YOLOv3 object detector on a public benchmark dataset. Experimental results show that the proposed method is competitive in terms of both quantitative assessment and visual quality.
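As a rough illustration of this idea, the following is a minimal PyTorch sketch, not the authors' exact architecture: the layer sizes and the per-pixel softmax normalization are assumptions. A shared-weight Siamese branch scores the saliency of each pixel in both sources, and the two score maps are normalized into a fusion weight map.

```python
# Minimal sketch (assumed architecture): a two-branch Siamese CNN with
# shared weights scores per-pixel saliency for an IR/VIS pair; a softmax
# over the two scores yields the weight map.
import torch
import torch.nn as nn

class SaliencyBranch(nn.Module):
    """Shared-weight branch: encodes one source image into a saliency score map."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 1),  # one saliency score per pixel
        )

    def forward(self, x):
        return self.features(x)

class SiameseWeightMap(nn.Module):
    """Applies the same branch to both sources and normalizes the scores."""
    def __init__(self):
        super().__init__()
        self.branch = SaliencyBranch()  # weights shared across both inputs

    def forward(self, ir, vis):
        s_ir, s_vis = self.branch(ir), self.branch(vis)
        # Per-pixel softmax over the two scores gives the IR weight map;
        # the VIS weight is its complement.
        w_ir = torch.softmax(torch.cat([s_ir, s_vis], dim=1), dim=1)[:, :1]
        return w_ir

# Usage: fuse a (1, 1, H, W) IR/VIS pair with the predicted weight map.
model = SiameseWeightMap()
ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
w = model(ir, vis)
fused = w * ir + (1 - w) * vis
```

In the paper the weighting is applied within a wavelet-based multi-scale decomposition rather than directly in the pixel domain as in this toy usage.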

Highlights

  • Infrared (IR) and visual (VIS) image fusion technology is used to generate a composite image from multiple spectral source images, combining complementary information of the same scene. The input source images are captured from different imaging modalities with different parameter settings

  • The typical methods surveyed in [32] are LP, Wavelet, NSCT, dual-tree multi-resolution discrete cosine transform (DTMDCT), cross bilateral filter (CBF), hybrid multi-scale decomposition (HMSD), guided filtering-based fusion (GFF), anisotropic diffusion-based fusion (ADF), ASR, LP and sparse representation (SR) (LPSR), orientation information-motivated PCNN (OI-PCNN), SF-motivated PCNNs in the NSCT domain (NSCT-SF-PCNN), directional discrete cosine transform and PCA (DDCTPCA), FPDE, two-scale image fusion based on visual saliency (TSIFVS), local edge-preserving LC (LEPLC), gradient transfer fusion (GTF), and IFEVIP

  • LP, Wavelet, NSCT, DTMDCT, CBF, HMSD, GFF, and ADF are typical multi-scale transform-based methods (a minimal wavelet-fusion sketch follows this list); ASR and LPSR belong to SR-based methods; OI-PCNN and NSCT-SF-PCNN are PCNN-based methods
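To make the multi-scale transform family above concrete, here is a minimal sketch of a classic wavelet-based fusion baseline. It is not any single surveyed method; the wavelet choice, level count, and the average/max-abs rules are assumptions illustrating the general recipe: decompose both sources, fuse coefficients band by band, and invert the transform.

```python
# Minimal wavelet-fusion sketch (assumed rules): average the coarse
# approximation bands, keep the larger-magnitude detail coefficients,
# then reconstruct with the inverse 2-D DWT.
import numpy as np
import pywt

def wavelet_fuse(ir, vis, wavelet="db2", levels=3):
    c_ir = pywt.wavedec2(ir, wavelet, level=levels)
    c_vis = pywt.wavedec2(vis, wavelet, level=levels)
    fused = [(c_ir[0] + c_vis[0]) / 2.0]          # low-pass band: average
    for d_ir, d_vis in zip(c_ir[1:], c_vis[1:]):  # per-level (H, V, D) details
        fused.append(tuple(
            np.where(np.abs(a) >= np.abs(b), a, b)  # max-abs selection rule
            for a, b in zip(d_ir, d_vis)))
    return pywt.waverec2(fused, wavelet)

# Usage with dummy single-channel images.
ir = np.random.rand(128, 128)
vis = np.random.rand(128, 128)
fused = wavelet_fuse(ir, vis)
```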

Introduction

Infrared (IR) and visual (VIS) image fusion technology is used to generate a composite image from multiple spectral source images, combining complementary information of the same scene. The input source images are captured from different imaging modalities with different parameter settings. The fused image is expected to be more suitable for human perception than any of the individual input images. Due to this advantage, image fusion techniques have wide applications in image processing and computer vision, where they improve the perception ability of both human and machine vision systems. The general framework of image fusion is to extract representative salient features from the source images of the same scene and integrate those features into a single image with a proper fusion method.
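As a toy illustration of this general framework (an assumed example, not the paper's method), the sketch below measures per-pixel activity with a local gradient energy and blends the two sources with normalized activity weights; the window size and the energy measure are arbitrary choices.

```python
# Toy salient-feature fusion (assumed example): local gradient energy
# serves as the activity measure, and its normalized value weights the
# per-pixel blend of the two sources.
import numpy as np

def activity(img, k=5):
    gy, gx = np.gradient(img.astype(np.float64))
    energy = gx**2 + gy**2
    # Box-filter the energy so each weight reflects a k-by-k neighbourhood.
    pad = k // 2
    padded = np.pad(energy, pad, mode="reflect")
    out = np.zeros_like(energy)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + energy.shape[0], dx:dx + energy.shape[1]]
    return out / (k * k)

def fuse(ir, vis, eps=1e-8):
    a_ir, a_vis = activity(ir), activity(vis)
    w = a_ir / (a_ir + a_vis + eps)   # normalized saliency weight
    return w * ir + (1 - w) * vis
```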
