Abstract

High-dynamic-range imaging technology is an effective way to overcome the limitations of a camera's dynamic range. However, most current high-dynamic imaging techniques are based on fusing multiple frames captured at different exposure levels, and such methods are prone to artifacts such as motion ghosting, detail loss and edge effects. In this paper, building on a dual-channel camera that can output two images with different gains simultaneously, we propose a semi-supervised network structure based on an attention mechanism to fuse multiple gain images. The proposed network comprises encoding, fusion and decoding modules. First, the U-Net structure is employed in the encoding module to extract important detailed information from the source image to the maximum extent. Simultaneously, the SENet attention mechanism is employed in the encoding module to assign different weights to different feature channels and emphasize important features. Then, the feature maps extracted by the encoding module are combined by the fusion module and input to the decoding module for reconstruction, yielding the fused image. Experimental results indicate that the fused images obtained by the proposed method exhibit clear details and high contrast. Compared with other methods, the proposed method improves fused image quality on several indicators.
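The SENet channel-attention step described above (assigning different weights to different feature channels) can be sketched as a squeeze-and-excitation operation. The following is a minimal NumPy illustration, not the authors' implementation; the layer shapes, random weights and the reduction ratio of 4 are assumptions for the example:

```python
import numpy as np

def se_block(feature_map, w1, w2):
    """Squeeze-and-excitation channel reweighting (illustrative sketch).

    feature_map: (C, H, W) array of encoder features.
    w1, w2: weights of the two fully connected bottleneck layers.
    """
    # Squeeze: global average pooling per channel -> (C,)
    z = feature_map.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid gate -> (C,) in (0, 1)
    h = np.maximum(0.0, w1 @ z)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))
    # Scale: reweight each feature channel by its learned importance
    return feature_map * s[:, None, None]

# Example usage with random (untrained) weights and reduction ratio 4
rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // 4, C))   # squeeze C -> C/4
w2 = rng.standard_normal((C, C // 4))   # expand C/4 -> C
y = se_block(x, w1, w2)
```

Because the gate `s` lies in (0, 1), each channel of `y` is a scaled-down copy of the corresponding channel of `x`, with informative channels suppressed less than uninformative ones once the weights are trained.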

Highlights

  • In the traditional camera structure, due to the limitations of the physical characteristics of the CCD, a single exposure cannot capture the full dynamic range of a scene as perceived by the human eye [1], which significantly degrades the visual quality of the image

  • HDR imaging technology obtains a wide-dynamic-range image by fusing multiple frames of the same scene captured at different exposures, which effectively overcomes the narrow dynamic range of cameras and improves image quality

  • To address the above problems, we propose a semi-supervised network structure to fuse multi-gain images captured by dual-channel cameras


Introduction

In the traditional camera structure, due to the limitations of the physical characteristics of the CCD, a single exposure cannot capture the full dynamic range of a scene as perceived by the human eye [1], which significantly affects the visual quality of the image. Ma et al. [12] proposed a multi-exposure image fusion algorithm based on structural block decomposition, which decomposes image blocks into three independent components and processes them separately to obtain a fused image. This method retains the structural information of the image well but is prone to block artifacts that seriously degrade visual quality. Prabhakar et al. [18] were the first to propose fusing multi-exposure images with an unsupervised deep learning framework. The fusion results of this network surpass traditional methods; however, its feature extraction structure is too simple, and deep features of the image are insufficiently extracted.
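The structural block decomposition of Ma et al. [12] splits each image block into three independent components: mean intensity, signal strength (contrast), and a unit-norm structure vector. A minimal NumPy sketch of that decomposition follows; the fusion rules applied to each component afterwards are specific to [12] and are not reproduced here:

```python
import numpy as np

def decompose_block(block, eps=1e-8):
    """Split an image block into mean intensity l, signal strength c,
    and unit-norm structure s, so that block = l + c * s."""
    l = block.mean()                  # mean intensity component
    residual = block - l
    c = np.linalg.norm(residual)      # signal strength (contrast)
    s = residual / (c + eps)          # unit-norm structure component
    return l, c, s

# Example: a small 2x2 block reconstructs from its three components
block = np.array([[0.2, 0.4],
                  [0.6, 0.8]])
l, c, s = decompose_block(block)
```

Processing the three components separately (e.g., taking the largest contrast, a weighted structure, and a weighted mean intensity across exposures) and recombining them as `l + c * s` yields the fused block.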

Through
Encoding Structure
Fusion
Decoding Module
Loss Function
Hardware Platform
Dataset and Training Strategy
Dataset and Note
Validation
Result and Analysis
Fusion Results
Conclusions
