Hairstyle transfer aims to combine the hairstyle attributes of reference images with the face and background of an identity image into a single realistic image. This is challenging because the semantic structures of the reference images and the identity image differ, and because the hairstyle attributes are entangled with one another. Existing GAN-based methods tend to simply combine image patches of hair, face, and background, which leads to discordant semantic structures and obvious artifacts in the results. GAN inversion-based methods, in turn, require elaborate orthogonalization of loss gradients, with additional computation, to alleviate the entanglement between the complex hairstyle attributes. To address these limitations, we propose a novel framework for hairstyle transfer based on StyleGAN inversion. Specifically, instead of simply combining image patches, we use StyleGAN's generative power to obtain more reasonable target semantic masks. During alignment with the target masks, background inpainting is performed by StyleGAN itself rather than by auxiliary models, which makes the results more realistic. Furthermore, our method decomposes hairstyle into three attributes (structure, texture, and color) to better control the transfer process. We observe that the spatial scales of these attributes match those of StyleGAN's latent codes. We therefore decouple the latent code along its hierarchical (layer) dimension and use different layers of the latent code to transfer different hair attributes, which reduces the entanglement between attributes without extra computation. Correspondingly, our method divides hairstyle transfer into three steps, one per attribute, making the transfer process more controllable. Comparisons with state-of-the-art methods verify the effectiveness of our method.
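The idea of assigning different latent layers to different hair attributes can be illustrated with a minimal sketch of layer-wise mixing of latent codes. This is not the method's actual implementation. It assumes an 18-layer, 512-dimensional extended latent code (as in StyleGAN2 at 1024x1024 resolution), and the layer ranges in `LAYER_GROUPS` are hypothetical, since the abstract does not state which layers carry which attribute. It also omits the mask alignment and any restriction of the edit to the hair region; the random arrays merely stand in for inverted latent codes.

```python
import numpy as np

NUM_LAYERS = 18    # assumption: StyleGAN2 at 1024x1024 has 18 style layers
LATENT_DIM = 512   # assumption: standard StyleGAN latent width

# Hypothetical split (not taken from the paper): coarse layers for structure,
# middle layers for texture, fine layers for color.
LAYER_GROUPS = {
    "structure": slice(0, 6),
    "texture": slice(6, 12),
    "color": slice(12, 18),
}


def mix_latents(identity_w, references):
    """Return a copy of identity_w in which each attribute's layer group
    is replaced by the same layers of that attribute's reference code."""
    mixed = identity_w.copy()
    for attribute, reference_w in references.items():
        layers = LAYER_GROUPS[attribute]
        mixed[layers] = reference_w[layers]
    return mixed


# Random placeholders standing in for latent codes obtained by GAN inversion.
rng = np.random.default_rng(0)
identity_w = rng.standard_normal((NUM_LAYERS, LATENT_DIM))
structure_w = rng.standard_normal((NUM_LAYERS, LATENT_DIM))
color_w = rng.standard_normal((NUM_LAYERS, LATENT_DIM))

# Take structure from one reference and color from another; texture layers
# are left as they are in the identity code.
mixed_w = mix_latents(identity_w, {"structure": structure_w, "color": color_w})
```

Because each attribute touches a disjoint block of layers in this sketch, swapping one attribute leaves the layers assigned to the others unchanged, which is the sense in which a hierarchical split can reduce entanglement without extra gradient computations.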