Abstract

The capacity to create "fake" videos has recently raised concerns about the reliability of multimedia content. Distinguishing authentic from manipulated content is a critical step toward resolving this problem. Several algorithms using deep learning and facial landmarks have yielded intriguing results on this issue. Facial landmarks are features tied solely to the subject's head pose. Based on this observation, in this work we study how Head Pose Estimation (HPE) patterns can be used to detect deepfakes. The HPE patterns studied are based on FSA-Net, SynergyNet, and WSM, which are among the best-performing state-of-the-art approaches. Their temporal patterns are then classified as authentic or fake using a machine learning technique based on K-Nearest Neighbors and Dynamic Time Warping. We also present a set of experiments examining the feasibility of applying deep learning techniques to such patterns. The findings reveal that the ability to recognize a deepfake video from an HPE pattern depends on the HPE methodology; conversely, detection performance depends less on the accuracy of the HPE technique itself. Experiments are carried out on the FaceForensics++ dataset, which contains both identity-swap and expression-swap examples. The findings show that FSA-Net is an effective feature extractor for determining whether a pattern belongs to a deepfake. The approach is also robust across deepfake videos created with different methods or for different goals. On average, the method achieves 86% accuracy on the identity-swap task and 86.5% on the expression-swap task. These findings open up various possibilities and future directions for tackling the deepfake detection problem using specialized HPE approaches, which are also known to be fast and reliable.
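The classification step described above can be illustrated with a minimal sketch: a 1-Nearest-Neighbor classifier over head-pose time series (e.g. per-frame yaw/pitch/roll vectors produced by an HPE network) using Dynamic Time Warping as the distance. This is a generic illustration of KNN + DTW, not the paper's exact implementation; function names and the Euclidean local cost are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two head-pose sequences,
    each an array of shape (frames, 3) holding yaw/pitch/roll angles."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between pose vectors (an assumption).
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Accumulate along the cheapest warping path.
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]

def knn_classify(query, train_patterns, train_labels, k=1):
    """Label a query pattern ('real' or 'fake') by majority vote
    among its k nearest neighbors under the DTW distance."""
    dists = [dtw_distance(query, p) for p in train_patterns]
    nearest = np.argsort(dists)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)
```

DTW allows two pose trajectories of different lengths or speeds to be compared by warping their time axes, which suits video clips whose head motions unfold at different rates.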
