Abstract

Deep fake stands for a face swapping algorithm where the source and target can be an image or a video. Researchers have investigated sophisticated generative adversarial networks (GAN), autoencoders, and other approaches to establish precise and robust algorithms for face swapping. However the achieved results are far from perfect in terms of human and visual evaluation. In this study, we propose a new one-shot pipeline for image-to-image and image-to-video face swap solutions. We take the FaceShifter (image-to-image) architecture as a baseline approach and propose several major architecture improvements which include a new eye-based loss function, face mask smooth algorithm, a new face swap pipeline for image-to-video face transfer, a new stabilization technique to decrease face jittering on adjacent frames and a super-resolution stage. In the experimental stage, we show that our solution outperforms SoTA face swap architectures in terms of ID retrieval (+1.5% improvement), shape (the second best value) and eye gaze preserving (+1% improvement) metrics. We also established an ablation study for our solution to estimate the contribution of pipeline stages to the overall accuracy, which showed that the eye loss leads to 2% improvement in the ID retrieval and 45% improvement in the eye gaze preserving.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.