DO-FAM: Disentangled Non-Linear Latent Navigation For Facial Attribute Manipulation

Yifan Yuan, Junping Zhang, Hongming Shan, Siteng Ma

doi:10.1109/icassp49357.2023.10095959

Abstract

Facial attribute manipulation (FAM) aims to edit the semantic attributes of facial images according to the user’s requirements. Unfortunately, most existing FAM methods struggle to meet at least one of two requirements: high reconstruction quality and high irrelevance preservation, that is, leaving attribute-unrelated content unchanged. To address these two limitations, we propose a novel Disentangled nOn-linear latent navigation framework for FAM, termed DO-FAM. To improve reconstruction quality, we leverage hypernetworks to fine-tune a pre-trained StyleGAN2 generator. To decouple entangled attributes, we propose a novel Disentangled nOn-Linear Latent transformation module, named DOLL, which consists of three components: (1) a decomposer that factorizes input latent codes into an attribute-related part and an attribute-unrelated part; (2) a non-linear Latent Transformation Network (LTNet) that navigates the attribute-related latent codes toward the target codes for the specified attribute(s); and (3) a latent classifier that predicts the attributes of latent codes to guide the latent code navigation. Extensive experiments on CelebA-HQ, a widely used benchmark dataset for facial editing, demonstrate the superiority of our method over state-of-the-art methods.
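The three DOLL components named in the abstract can be illustrated with a minimal NumPy sketch of the data flow. This is not the authors' implementation: the layer sizes, the linear decomposer, the residual two-layer LTNet, the sigmoid classifier, and the `recompose` step that maps the two parts back to a full latent code are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 512    # assumed latent size, not stated in the abstract
RELATED_DIM = 64    # hypothetical size of the attribute-related part
NUM_ATTRS = 3       # hypothetical number of editable attributes


def init_linear(n_in, n_out):
    """Random (untrained) weights and zero bias for one linear layer."""
    return rng.normal(0.0, 0.02, (n_in, n_out)), np.zeros(n_out)


def linear(x, params):
    weight, bias = params
    return x @ weight + bias


# (1) Decomposer: two hypothetical linear projections of the latent code.
dec_related = init_linear(LATENT_DIM, RELATED_DIM)
dec_unrelated = init_linear(LATENT_DIM, LATENT_DIM - RELATED_DIM)


def decompose(w):
    """Split a latent code into attribute-related and -unrelated parts."""
    return linear(w, dec_related), linear(w, dec_unrelated)


# (2) LTNet: a small non-linear network conditioned on the target attributes.
lt_hidden = init_linear(RELATED_DIM + NUM_ATTRS, 128)
lt_out = init_linear(128, RELATED_DIM)


def ltnet(z_related, target_attrs):
    """Move the attribute-related code by a non-linear, residual step."""
    h = np.tanh(linear(np.concatenate([z_related, target_attrs], axis=-1), lt_hidden))
    return z_related + linear(h, lt_out)


# (3) Latent classifier: predicts attribute probabilities from the related code.
clf = init_linear(RELATED_DIM, NUM_ATTRS)


def classify(z_related):
    return 1.0 / (1.0 + np.exp(-linear(z_related, clf)))


# Hypothetical recomposition back to a full latent code for the generator.
rec = init_linear(LATENT_DIM, LATENT_DIM)


def recompose(z_related, z_unrelated):
    return linear(np.concatenate([z_related, z_unrelated], axis=-1), rec)


def edit(w, target_attrs):
    """Edit only the attribute-related part; pass the rest through untouched."""
    z_related, z_unrelated = decompose(w)
    z_edited = ltnet(z_related, target_attrs)
    return recompose(z_edited, z_unrelated), classify(z_edited), z_unrelated


w = rng.normal(size=(4, LATENT_DIM))                  # stand-in latent codes
target = np.tile(np.array([1.0, 0.0, 1.0]), (4, 1))   # stand-in target attributes
w_edit, attr_probs, z_unrelated = edit(w, target)
```

In a trained system, the classifier's predictions on the edited code would presumably feed a loss that steers LTNet toward the target attributes, while the attribute-unrelated part is carried through unchanged, which is the property the abstract calls irrelevance preservation.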

Full Text