Abstract
In this paper, we propose a novel unsupervised model for multi-focus image fusion based on gradients and connected regions, termed GCF. To overcome the stumbling block of vanishing gradients when applying deep networks to multi-focus image fusion, we design Mask-Net, which directly generates a binary mask; thus, no hand-crafted feature extraction or fusion rules are needed. Based on the fact that objects within the depth of field (DOF) have a sharper appearance, i.e., larger gradients, we use a gradient relation map obtained from the source images to narrow the solution domain and speed up convergence. A constraint on the number of connected regions is then conducive to finding a more accurate binary mask. With a consistency verification strategy, the final mask is obtained by refining the initial binary mask and is used to generate the fused result. The proposed method is therefore an unsupervised model that requires no ground-truth data. Both qualitative and quantitative experiments are conducted on the publicly available Lytro dataset. The results show that GCF outperforms the state of the art in both visual perception and objective metrics.
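To make the pipeline concrete, the sketch below illustrates how a gradient relation map and a binary focus mask could drive the fusion described above. This is a minimal illustration under assumptions, not the authors' implementation; the function names (`gradient_relation_map`, `fuse_with_mask`) and the choice of Gaussian gradient magnitude are hypothetical.

```python
# Illustrative sketch only: gradient-guided initial mask and mask-based fusion.
import numpy as np
from scipy import ndimage

def gradient_relation_map(img_a, img_b, sigma=1.0):
    """Mark pixels where source A has the larger gradient magnitude
    (assumed to be better focused), giving a rough initial binary mask."""
    grad_a = ndimage.gaussian_gradient_magnitude(img_a, sigma=sigma)
    grad_b = ndimage.gaussian_gradient_magnitude(img_b, sigma=sigma)
    return (grad_a >= grad_b).astype(np.float32)

def fuse_with_mask(img_a, img_b, mask):
    """Pixel-wise fusion: take A where the mask is 1, B elsewhere."""
    return mask * img_a + (1.0 - mask) * img_b
```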
Highlights
Due to hardware limitations, a digital single-lens reflex camera cannot capture all the information in a scene, so some information is inevitably lost
To further validate the effectiveness of our proposed method, we compare it with seven state-of-the-art fusion methods, including DCTVar [33], dense scale-invariant feature transform (DSIFT) [14], MST_SR [34], GBM [35], convolutional neural networks (CNNs) [26], DenseFuse [21], and SESF-Fuse [22]
In this paper, we put forward a gradient- and connected-region-based unsupervised model for multi-focus image fusion, termed GCF
Summary
Due to hardware limitations, a digital single-lens reflex camera cannot capture all the information in a scene, so some information is inevitably lost. As a supervised deep model, Liu et al. proposed a CNN-based multi-focus fusion method [26]. It employed the original images as the clear samples and their Gaussian-convolved counterparts as the blurred samples to train a network that determines whether each pixel is focused. In the proposed GCF, a loss function based on connected regions is conducive to finding a more accurate binary mask; it overcomes the stumbling block of directly training a CNN to generate a binary mask by addressing the problem of vanishing gradients. GCF is an unsupervised model that requires no manually synthesized dataset as ground truth. The Mask-Net in the proposed GCF can be trained to generate the mask directly, without hand-crafted feature extraction or fusion rules
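As a rough illustration of the two ideas mentioned in this summary, the sketch below shows (a) how clear/blurred training pairs of the kind used by the CNN-based method of Liu et al. could be synthesized by Gaussian convolution, and (b) a simple, non-differentiable way to count connected regions in a binary mask, which is the quantity the connected-region constraint penalizes. Names, the blur sigma, and the labeling-based counting are assumptions for illustration, not the papers' actual code or loss.

```python
# Illustrative sketch only: synthetic focused/defocused pairs and
# connected-region counting for a binary focus mask.
import numpy as np
from scipy.ndimage import gaussian_filter, label

def make_training_pair(patch, sigma=2.0):
    """Return (focused, defocused) versions of an image patch:
    the original is treated as clear, its Gaussian-blurred copy as blurred."""
    return patch, gaussian_filter(patch, sigma=sigma)

def count_connected_regions(binary_mask):
    """Count connected regions in a binary mask via labeling.
    A mask with many scattered regions would be penalized by a
    connected-region constraint (counting shown here is non-differentiable)."""
    _, num_regions = label(binary_mask)
    return num_regions
```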