Abstract

Facial image super-resolution (SR) is an important aspect of facial analysis, and it can contribute significantly to tasks such as face alignment, face recognition, and image-based 3D reconstruction. Recent convolutional neural network (CNN) based models have advanced significantly by learning mapping relations from pairs of low-resolution (LR) and high-resolution (HR) facial images. However, because these methods conventionally aim to increase the PSNR and SSIM metrics, the reconstructed HR images can be blurry and perceptually unsatisfactory even when state-of-the-art quantitative results are achieved. In this study, we address this limitation by proposing an adversarial framework that reconstructs perceptually high-quality HR facial images while simultaneously removing blur. To this end, a simple five-layer CNN extracts feature maps from LR facial images, and this feature information is provided to two-branch encoder-decoder networks that generate HR facial images with and without blur. In addition, local and global discriminators are combined to focus on the reconstruction of HR facial structures. Both qualitative and quantitative results demonstrate the effectiveness of the proposed method for generating photorealistic HR facial images from a variety of LR inputs. Moreover, a use-case scenario verified that the proposed method can contribute more to face recognition than existing approaches.

Highlights

  • Blurry, low-resolution (LR) facial images, which are frequently observed in surveillance videos and old video footage, pose a fundamental problem in computer vision and image processing

  • Ensuring high performance is difficult when such factors degrade the facial images used for face-related tasks, such as face landmark detection [1], face recognition [2], face parsing [3], and 3D face reconstruction [4], [5]

  • The results show that the proposed method can generate photorealistic, clear HR facial images without blur, even when the input facial image has experienced significant degradation

Summary

INTRODUCTION

Blurry, low-resolution (LR) facial images, which are frequently observed in surveillance videos and old video footage, pose a fundamental problem in computer vision and image processing. FSRNet uses face landmark heatmaps and parsing maps as face prior information, and these are estimated in a prior estimation network. FSRGAN was also proposed to incorporate adversarial loss into FSRNet. This approach exhibits higher performance than existing methods by generating face prior information and reconstructing an HR facial image.

OVERVIEW OF PROPOSED METHOD

The proposed network consists of the following components: a five-layer CNN, a face region prior, a generator G with a modified U-Net [34] structure, and global and local discriminators.

B. GENERATOR MODULE

The generator network G includes a five-layer CNN, which extracts feature maps with an 8x spatial resolution instead of performing simple 8x upscaling of the input images.

The weights λ1 and λ2 are used to balance the contribution of each loss.
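The role of the weights λ1 and λ2 can be illustrated with a minimal sketch. This summary does not state which loss terms the weights scale, so the code below simply assumes one unweighted base term plus two weighted auxiliary terms. The function name, argument names, and numeric values are all hypothetical and are not taken from the paper.

```python
def total_generator_loss(base_loss, aux_losses, weights):
    """Combine a base loss with weighted auxiliary losses.

    Assumed form (not confirmed by the source):
        total = base_loss + sum(weight_i * aux_loss_i)
    where the weights play the role of lambda_1, lambda_2, ...
    """
    if len(aux_losses) != len(weights):
        raise ValueError("each auxiliary loss needs exactly one weight")
    # Each weight scales how strongly its loss term steers training.
    return base_loss + sum(w * l for w, l in zip(weights, aux_losses))


# Hypothetical values: one base term, two auxiliary terms, two weights.
total = total_generator_loss(0.5, [2.0, 4.0], [0.25, 0.125])
print(total)
```

Raising one weight makes its term dominate the total, so in practice such weights are typically tuned until no single objective overwhelms the others.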

EXPERIMENTAL RESULTS
DATASETS
LIMITATIONS
CONCLUSION
