Abstract

Media forensics has attracted tremendous attention in recent years, in part due to increasing concerns around DeepFakes. Since the release of the initial 1st generation DeepFake databases such as UADFV and FaceForensics++ up to the latest 2nd generation databases such as Celeb-DF and DFDC, many visual improvements have been made, making fake videos almost indistinguishable to the human eye. This study provides an in-depth analysis of both the 1st and 2nd DeepFake generations in terms of fake detection performance. Two different methods are considered in our experimental framework: (i) the traditional one followed in the literature, based on selecting the entire face as input to the fake detection system, and (ii) a novel approach based on the selection of specific facial regions as input to the fake detection system. Fusion techniques are applied both to the facial regions and to three different state-of-the-art fake detection systems (Xception, Capsule Network, and DSP-FWA) in order to further increase the robustness of the detectors considered. Finally, experiments on intra- and inter-database scenarios are performed. Among the findings resulting from our experiments, we highlight: (i) the very good results achieved using facial regions and fusion techniques, with fake detection results above 99% Area Under the Curve (AUC) for the UADFV, FaceForensics++, and Celeb-DF v2 databases, and (ii) the need for more effort on the analysis of inter-database scenarios to improve the ability of fake detectors against attacks unseen during learning.
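To illustrate the kind of score-level fusion and AUC evaluation described above, the following is a minimal sketch, not the authors' code: it assumes each detector outputs a per-video "fake" probability, uses a simple (optionally weighted) average as the fusion rule, and evaluates with scikit-learn's roc_auc_score. The detector score values and labels shown are hypothetical placeholders.

```python
# Minimal sketch (assumption, not the paper's implementation): score-level
# fusion of several fake detectors followed by AUC evaluation.
import numpy as np
from sklearn.metrics import roc_auc_score

def fuse_scores(score_lists, weights=None):
    """Average (optionally weighted) fusion of per-detector score arrays."""
    scores = np.vstack(score_lists)        # shape: (n_detectors, n_videos)
    return np.average(scores, axis=0, weights=weights)

# Hypothetical per-video scores from three detectors
# (e.g. Xception, Capsule Network, DSP-FWA) on a small evaluation set.
xception_scores = np.array([0.91, 0.12, 0.83, 0.05])
capsule_scores  = np.array([0.88, 0.20, 0.79, 0.10])
dspfwa_scores   = np.array([0.95, 0.08, 0.70, 0.15])
labels          = np.array([1, 0, 1, 0])   # 1 = fake, 0 = real

fused = fuse_scores([xception_scores, capsule_scores, dspfwa_scores])
print("Fused AUC:", roc_auc_score(labels, fused))
```

The same averaging rule could be applied across scores from different facial regions before combining detectors; the exact fusion strategy used in the study may differ.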
