Splitting an image into its specular and diffuse components is a fundamental problem in computer vision. Most computer vision algorithms, such as image segmentation and tracking, assume diffuse surfaces, so the presence of specular reflection can mislead them into making incorrect decisions. Existing decomposition methods tend to work well for images with low specularity and high chromaticity, but they fail under high-intensity specular light and on images with low chromaticity. In this paper, we address the problem of removing high-intensity specularity from low-chromaticity images, specifically faces. We introduce a new dataset, Spec-Face, comprising face images corrupted with specular lighting and the corresponding ground truth diffuse images. We also introduce two deep learning models for specularity removal, Spec-Net and Spec-CGAN. Spec-Net takes an intensity channel as input and produces an output image that is very close to the ground truth, while Spec-CGAN takes an RGB image as input and produces a diffuse image very similar to the ground truth RGB image. On Spec-Face, with Spec-Net, we obtain a peak signal-to-noise ratio (PSNR) of 3.979, a local mean squared error (LMSE) of 0.000071, a structural similarity index (SSIM) of 0.899, and a Fréchet Inception Distance (FID) of 20.932. With Spec-CGAN, we obtain a PSNR of 3.360, an LMSE of 0.000098, an SSIM of 0.707, and an FID of 31.699. With Spec-Net and Spec-CGAN, it is now feasible to perform specularity removal automatically on real-world images such as faces before other critical, complex vision processes. This could improve the performance of algorithms later in the processing stream, such as face recognition and skin cancer detection.