The main advantage of DDIM is that it preserves the quality of generated images while making generation more efficient by modifying the sampling strategy of the diffusion process. DiffusionRig, which addresses the problem of maintaining identity consistency by learning a person-specific facial prior from a tiny personalized dataset, is a successful application of the DDIM strategy. Building on DiffusionRig, in this article we propose an improved DDIM-based face attribute editing method that aims to improve the naturalness and accuracy of editing results in complex face attribute editing tasks, as well as generalization ability. Our method combines DDIM with DECA and uses a two-stage training strategy. To reduce DiffusionRig's limitations in handling face attribute editing tasks that require nonlinear understanding and fine-tuning, our method also introduces a channel attention mechanism and depthwise separable convolution into the trained model. In the first stage, we trained our model on the open FFHQ dataset, which consists of 30,000 high-resolution face images. In the second stage, the model was refined using a tiny personalized dataset. We performed face attribute editing experiments, a comparative experiment with DiffusionRig, and a series of ablation experiments. The validity of our improved method is then verified by qualitative and quantitative analysis.
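To make the modified sampling strategy concrete, the following is a minimal sketch of one deterministic DDIM update (the standard formulation with eta = 0), not the implementation used in this work. The function name `ddim_step`, its parameter names, and the use of plain Python lists are assumptions made for illustration; in practice `eps_pred` would come from the trained noise-prediction network.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0), applied elementwise.

    x_t            -- current noisy sample, as a list of floats
    eps_pred       -- predicted noise for x_t (hypothetical model output)
    alpha_bar_t    -- cumulative noise-schedule product at the current step
    alpha_bar_prev -- cumulative product at the earlier (target) step
    """
    out = []
    for x, e in zip(x_t, eps_pred):
        # Estimate the clean sample implied by the current noise prediction.
        x0_pred = (x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
        # Move to the target step without injecting fresh noise.
        out.append(
            math.sqrt(alpha_bar_prev) * x0_pred
            + math.sqrt(1.0 - alpha_bar_prev) * e
        )
    return out
```

Because the update adds no new noise, the target step does not have to be the immediately preceding one. Sampling can therefore run over a short subsequence of the training timesteps, which is where the efficiency gain comes from.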