Most existing face inpainting methods focus on generating a single result that visually resembles the original image. The generation of diverse and plausible results has emerged as a new branch of image restoration, often referred to as “Pluralistic Image Completion”. However, most diversity-oriented methods simply sample random latent vectors to generate multiple results, which makes the outcomes uncontrollable. To overcome these limitations, we introduce a novel architecture, the Reference-Guided Directional Diverse Face Inpainting Network. Instead of the background image typically used as a reference in image restoration, we use a face image as the reference face style; this image may differ from the original in many characteristics, including but not limited to gender and age. Our network first infers the semantic information of the masked face, i.e., the face parsing map, from the partial image and its mask; this map then guides and constrains the directional diverse generator network. The network learns the distribution of face images from different domains in a low-dimensional manifold space. To validate our method, we conducted extensive experiments on the CelebAMask-HQ dataset. Our method not only produces high-quality, directionally diverse results but also completes the images in the style of the reference face image. Additionally, our diverse results maintain correct facial feature distributions and sizes, rather than being random. At the time of writing, our network achieves state-of-the-art results in diverse face inpainting. Code is available at https://github.com/nothingwithyou/RDFINet.