In daily life, when photos are taken of scenes containing glass, the dominant transmission layer and the weak reflection layer are blended in the captured image and are difficult to separate. Because the reflection layer carries important information about the surrounding scene and the photographer, recovering the weak reflection layer from the mixture image is of great value in surveillance investigations. However, most existing studies focus on extracting the transmission layer and overlook the value of the reflection layer. To fill this gap, we propose a network framework that addresses two tasks: (1) for general scenes, we recover reflection layer images that are as close as possible to the ground truth, and (2) for scenes containing portraits, we recover the basic contour information of the reflection layer while alleviating the dimness of portraits in the reflection layer. By analyzing the behavior of feature maps at different levels, we present the first transmission removal network built on an image-to-image translation architecture incorporating residual structures. The quality of the generated reflection layer images is improved via tailored content and style constraints. We also employ a patch-based generative adversarial network (PatchGAN) to strengthen the discriminator's ability to perceive the reflection components in the generated images. In addition, cues from the transmission layer in the mixture image, such as its edges and color distribution, are used to assist the overall reflection layer recovery. In large-scale experiments, our model outperforms state-of-the-art reflection removal methods by more than 5.356 dB in PSNR, 0.116 in SSIM, and 0.057 in LPIPS.
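To make the architectural ingredients named in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch of a residual-block image-to-image generator paired with a PatchGAN discriminator; it is not the authors' network, and all layer widths, block counts, and class names are illustrative assumptions.

```python
# Illustrative sketch only (not the paper's released code): a residual-block
# image-to-image generator (mixture -> estimated reflection layer) and a
# PatchGAN discriminator that scores local patches as real or fake.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Conv-IN-ReLU-Conv-IN block with an identity skip connection."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)


class ReflectionGenerator(nn.Module):
    """Image-to-image translator: mixture image -> reflection layer estimate."""

    def __init__(self, in_ch=3, out_ch=3, base=64, n_blocks=6):
        super().__init__()
        layers = [
            nn.ReflectionPad2d(3),
            nn.Conv2d(in_ch, base, kernel_size=7),
            nn.InstanceNorm2d(base),
            nn.ReLU(inplace=True),
            # two stride-2 downsampling stages
            nn.Conv2d(base, base * 2, 3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2),
            nn.ReLU(inplace=True),
            nn.Conv2d(base * 2, base * 4, 3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 4),
            nn.ReLU(inplace=True),
        ]
        layers += [ResidualBlock(base * 4) for _ in range(n_blocks)]
        layers += [
            nn.ConvTranspose2d(base * 4, base * 2, 3, stride=2,
                               padding=1, output_padding=1),
            nn.InstanceNorm2d(base * 2),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base * 2, base, 3, stride=2,
                               padding=1, output_padding=1),
            nn.InstanceNorm2d(base),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(3),
            nn.Conv2d(base, out_ch, kernel_size=7),
            nn.Tanh(),
        ]
        self.net = nn.Sequential(*layers)

    def forward(self, mixture):
        return self.net(mixture)


class PatchDiscriminator(nn.Module):
    """PatchGAN: outputs a grid of real/fake logits, one per local patch."""

    def __init__(self, in_ch=3, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 2, base * 4, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 4, 1, 4, padding=1),  # per-patch logits
        )

    def forward(self, image):
        return self.net(image)


if __name__ == "__main__":
    mixture = torch.randn(1, 3, 256, 256)         # glass photo: transmission + reflection
    reflection_hat = ReflectionGenerator()(mixture)
    patch_scores = PatchDiscriminator()(reflection_hat)
    print(reflection_hat.shape, patch_scores.shape)  # (1, 3, 256, 256), (1, 1, 31, 31)
```

In such a setup, the generator would typically be trained with the tailored content/style constraints mentioned above (e.g., feature-space losses) in addition to the adversarial loss from the patch discriminator; the exact loss weighting and auxiliary use of transmission-layer edge and color cues are specific to the paper and not reproduced here.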