Abstract
This paper presents a deep end-to-end network for high dynamic range (HDR) imaging of dynamic scenes with background and foreground motions. Generating an HDR image from a sequence of multi-exposure images is challenging when the images are misaligned because they were captured in a dynamic scene. Hence, recent methods first align the multi-exposure images to the reference using patch matching, optical flow, homography transformation, or an attention module before merging. In this paper, we propose a deep network that synthesizes the aligned images by blending the information from the multi-exposure images, because explicitly aligning photos with different exposures is an inherently difficult problem. Specifically, the proposed network generates under/over-exposure images that are structurally aligned to the reference by blending all the information from the dynamic multi-exposure images. Our primary idea is that blending two images in the deep-feature domain is effective for synthesizing multi-exposure images that are structurally aligned to the reference, yielding better-aligned images than pixel-domain blending or geometric transformation methods. Concretely, our alignment network consists of a two-way encoder that extracts features from the two images separately, several convolution layers that blend the deep features, and a decoder that reconstructs the aligned images. The proposed network is shown to generate well-aligned images across a wide range of exposure differences and thus can be effectively used for HDR imaging of dynamic scenes. Moreover, by adding a simple merging network after the alignment network and training the overall system end-to-end, we obtain a performance gain over recent state-of-the-art methods.
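The following is a minimal PyTorch sketch of the alignment idea described above: a two-way encoder extracts features from the reference and a non-reference exposure separately, a few convolution layers blend the deep features, and a decoder reconstructs an exposure image structurally aligned to the reference. Layer counts, channel widths, and all names are illustrative assumptions, not the authors' exact architecture.

```python
# Illustrative sketch only; not the authors' exact network.
import torch
import torch.nn as nn

class AlignmentNet(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        # Two-way encoder: one branch per input image (weights not shared here).
        self.enc_ref = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.enc_src = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Blending: convolutions over the concatenated deep features.
        self.blend = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Decoder: reconstruct the source exposure aligned to the reference.
        self.dec = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, reference, source):
        f_ref = self.enc_ref(reference)   # features of the reference exposure
        f_src = self.enc_src(source)      # features of the under/over-exposure
        fused = self.blend(torch.cat([f_ref, f_src], dim=1))
        return self.dec(fused)            # source exposure, aligned to the reference
```

A merging network can then take the reference together with the aligned exposures produced this way and regress the final HDR image, with both stages trained end-to-end.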
Highlights
The dynamic range of standard cameras is too narrow compared with that of most scenes around us
We show that the Exposure-Structure Blending Network (ESBN) works well even for quite large exposure differences, and any existing high dynamic range (HDR) imaging method can use the outputs of our ESBN to generate a plausible HDR image
We use PSNR and SSIM to evaluate the aligned results, and we use HDR-VDP-2, PSNR, and SSIM to assess the quality of the final HDR image (see the metric sketch after this list)
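Below is a hypothetical evaluation helper illustrating the fidelity metrics named above (PSNR and SSIM); it uses scikit-image's standard implementations and is not tied to the paper's exact evaluation protocol. HDR-VDP-2 is a separate MATLAB toolbox and is not reproduced here.

```python
# Illustrative metric sketch, assuming float images in [0, 1].
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(result, ground_truth):
    """Compare an aligned or merged result against its ground truth (HxWx3 float arrays)."""
    psnr = peak_signal_noise_ratio(ground_truth, result, data_range=1.0)
    ssim = structural_similarity(ground_truth, result, channel_axis=-1, data_range=1.0)
    return psnr, ssim
```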
Summary
The dynamic range of standard cameras is too narrow compared with that of most scenes around us. They cannot capture regions that are too bright or too dark, whose illumination values fall outside the range of normal camera settings. The most common approach is to take a sequence of low dynamic range (LDR) images with different exposures and fuse them into an HDR image [7], [13]. The HDR image is then appropriately tonemapped to the display dynamic range [8], [35], [43]. Another approach is to directly synthesize a tonemapped-like image.
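As a point of reference for the fusion step mentioned above, the following is a minimal NumPy sketch in the spirit of classical multi-exposure merging (not the paper's learned pipeline): each LDR frame is divided by its exposure time and the results are averaged with a hat weight that downweights under- and over-exposed pixels. Function and variable names are illustrative assumptions, and a linear camera response is assumed for simplicity.

```python
# Illustrative classical merge, assuming linear response and static, aligned frames.
import numpy as np

def merge_exposures(ldr_images, exposure_times):
    """ldr_images: list of HxWx3 float arrays in [0, 1]; exposure_times: seconds."""
    numerator = np.zeros_like(ldr_images[0])
    denominator = np.zeros_like(ldr_images[0])
    for img, t in zip(ldr_images, exposure_times):
        weight = 1.0 - np.abs(2.0 * img - 1.0)   # hat weighting: trust mid-tones most
        numerator += weight * (img / t)          # back-project to linear radiance
        denominator += weight
    return numerator / np.maximum(denominator, 1e-6)  # HDR radiance estimate
```

In practice the camera response curve must be recovered first, and the merged radiance map is then tonemapped for display; this simple weighting also breaks down under motion, which is exactly the situation the proposed alignment network addresses.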