Abstract

Image fusion merges two or more images to construct a single, more informative fused image. Recently, unsupervised learning-based convolutional neural networks (CNNs) have been used for different image-fusion tasks such as medical image fusion, infrared-visible image fusion for autonomous driving, as well as multi-focus and multi-exposure image fusion for satellite imagery. However, it is challenging to analyze the reliability of these CNNs for image-fusion tasks since no ground truth is available. This has led to the use of a wide variety of model architectures and optimization functions that yield quite different fusion results. Additionally, due to the highly opaque nature of such neural networks, it is difficult to explain the internal mechanics behind their fusion results. To overcome these challenges, we present a novel real-time visualization tool, named FuseVis, with which the end-user can compute per-pixel saliency maps that examine the influence of the input image pixels on each pixel of the fused image. We trained several fusion-based CNNs on medical image pairs and then, using our FuseVis tool, performed case studies on a specific clinical application by interpreting the saliency maps of each fusion method. We specifically visualized the relative influence of each input image on the predictions of the fused image and showed that some of the evaluated fusion methods are better suited for the specific clinical application. To the best of our knowledge, there is currently no approach for the visual analysis of neural networks for image fusion. This work therefore opens a new research direction to improve the interpretability of deep fusion networks. The FuseVis tool can also be adapted to other deep neural network-based image processing applications to make them interpretable.
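
The per-pixel saliency maps described above correspond to rows of the Jacobian of the fused image with respect to the input images. The minimal sketch below shows how such a map could be obtained for one fused pixel via automatic differentiation; it assumes PyTorch and a hypothetical pretrained fusion network `fusion_net` taking two single-channel inputs, and is illustrative rather than the actual FuseVis implementation.

```python
# Minimal sketch (assumption: PyTorch; `fusion_net` is a hypothetical pretrained
# fusion CNN mapping two single-channel images to one fused image).
import torch

def jacobian_maps(fusion_net, img_a, img_b, row, col):
    """Return the influence (Jacobian row) of every input pixel on the
    fused-image pixel at (row, col)."""
    img_a = img_a.detach().clone().requires_grad_(True)   # shape (1, 1, H, W)
    img_b = img_b.detach().clone().requires_grad_(True)
    fused = fusion_net(img_a, img_b)                       # shape (1, 1, H, W)
    # A single backward pass from the selected fused pixel yields its gradient
    # with respect to all input pixels, i.e. one row of the Jacobian.
    fused[0, 0, row, col].backward()
    return img_a.grad[0, 0].abs(), img_b.grad[0, 0].abs()
```

Because only one backward pass is needed per selected pixel, such maps can in principle be recomputed interactively as the user moves over the fused image.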

Highlights

  • The recent development of state-of-the-art imaging modalities has revolutionized the way we perform our everyday activities

  • Features from the relatively brighter regions of the Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) images are reproduced in the fused image; as a result, the Weighted Averaging method is unable to preserve the clinically important dark PET features related to the necrotic core, while it favorably preserves the bright PET features resembling healthy tissues

  • We developed an easy-to-use FuseVis tool that enables the end-user to visualize Jacobian images of the selected principal pixel in real time during a mouseover interaction (a minimal sketch of this interaction follows the list)
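
As a rough illustration of the mouseover interaction mentioned in the last highlight, the sketch below wires the earlier `jacobian_maps` helper to a matplotlib figure: moving the cursor over the fused image selects the principal pixel and updates the two Jacobian views. The names `fusion_net`, `img_a`, and `img_b` are the same illustrative placeholders as before; this is a hypothetical setup, not the actual FuseVis implementation.

```python
# Hypothetical mouseover wiring (assumption: matplotlib; not FuseVis internals).
import matplotlib.pyplot as plt

fused_image = fusion_net(img_a, img_b)[0, 0].detach().cpu()

fig, (ax_fused, ax_jac_a, ax_jac_b) = plt.subplots(1, 3, figsize=(12, 4))
ax_fused.imshow(fused_image, cmap='gray')

def on_mouse_move(event):
    # React only to cursor positions inside the fused-image axes.
    if event.inaxes is not ax_fused or event.xdata is None:
        return
    row, col = int(event.ydata), int(event.xdata)  # principal pixel under the cursor
    jac_a, jac_b = jacobian_maps(fusion_net, img_a, img_b, row, col)
    ax_jac_a.imshow(jac_a.cpu(), cmap='hot')  # influence of input image A
    ax_jac_b.imshow(jac_b.cpu(), cmap='hot')  # influence of input image B
    fig.canvas.draw_idle()

fig.canvas.mpl_connect('motion_notify_event', on_mouse_move)
plt.show()
```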

Introduction

The recent development of state-of-the-art imaging modalities has revolutionized the way we perform our everyday activities. In self-driving cars, infrared images from camera sensors mounted on the vehicle help to detect obstacles such as pedestrians at night. Remote sensing, on the other hand, acquires multi-spectral and multi-resolution satellite images that are needed for object detection and recognition from high altitudes. In medical diagnosis and treatment, Magnetic Resonance Imaging (MRI) provides a detailed view of the internal structures of the human brain, such as white and gray matter, whereas Positron Emission Tomography (PET) and Single Photon Emission Computed Tomography (SPECT) images provide functional information such as glucose metabolism and the extent of cerebral blood flow (CBF) or perfusion activity in specific regions of the brain. Since no single modality captures all of this complementary information, multimodal image fusion addresses the problem by combining two or more pre-registered images from single or multiple imaging modalities into a single, more informative fused image.
