The growing prevalence of DeepFakes poses a profound threat to individual privacy and the stability of society. Mistaking synthetic videos of a celebrity for genuine footage, or passing off impersonated forgery videos as authentic, are just two of the harms DeepFakes can cause. We observe that current detectors, which blindly deploy deep learning techniques, are not effective in capturing subtle forgery clues when generative models produce remarkably realistic faces. Since synthetic operations inevitably modify the eye and mouth regions to match the target face with the identity or expression of the source face, we conjecture that the continuity of facial movement patterns representing expressions in genuine faces will be disrupted or completely broken in synthetic faces, making it a potentially powerful indicator for DeepFake detection. To verify this conjecture, we use a dual‐branch network to capture the inconsistent patterns of facial movements within the eye and mouth regions separately. Extensive experiments on the popular FaceForensics++, Celeb‐DF‐v1, Celeb‐DF‐v2, and DFDC‐Preview datasets demonstrate that our method is not only effective but also outperforms state‐of‐the‐art baselines. Moreover, our method exhibits greater robustness against adversarial attacks, with attack success rates (ASR) of 54.8% under the I‐FGSM attack and 43.1% under the PGD attack on the DeepFakes subset of FaceForensics++, respectively.
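To make the dual‐branch idea concrete, below is a minimal sketch of a two‐branch detector in PyTorch. The abstract does not specify backbones, input shapes, or fusion strategy, so the small 3D‐CNN encoders, the (batch, channels, frames, height, width) clip layout, and all layer sizes here are illustrative assumptions, not the paper's actual architecture.

```python
# Illustrative sketch only: a dual-branch detector with separate encoders
# for eye-region and mouth-region clips, fused for a real/fake prediction.
# Backbone choice, input layout, and layer sizes are assumptions.
import torch
import torch.nn as nn

class BranchEncoder(nn.Module):
    """Encodes a cropped facial-region clip into a feature vector
    summarizing its spatio-temporal movement patterns."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1),  # spatio-temporal conv
            nn.ReLU(inplace=True),
            nn.MaxPool3d((1, 2, 2)),                     # downsample space only
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),                     # pool over time and space
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, clip):                             # clip: (B, 3, T, H, W)
        x = self.conv(clip).flatten(1)                   # (B, 64)
        return self.fc(x)                                # (B, feat_dim)

class DualBranchDetector(nn.Module):
    """Separate eyes/mouth branches whose features are concatenated
    and classified as real vs. fake."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.eye_branch = BranchEncoder(feat_dim)
        self.mouth_branch = BranchEncoder(feat_dim)
        self.classifier = nn.Linear(2 * feat_dim, 2)     # real/fake logits

    def forward(self, eye_clip, mouth_clip):
        fused = torch.cat([self.eye_branch(eye_clip),
                           self.mouth_branch(mouth_clip)], dim=1)
        return self.classifier(fused)

# Example: a batch of 4 clips, 8 frames each, 64x64 crops per region.
model = DualBranchDetector()
logits = model(torch.randn(4, 3, 8, 64, 64), torch.randn(4, 3, 8, 64, 64))
print(logits.shape)  # torch.Size([4, 2])
```

Processing the two regions in separate branches, rather than the whole face in one network, lets each encoder specialize in the movement continuity of its region before fusion, which is the core intuition the abstract describes.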