Abstract

Deepfake detection serves as a countermeasure for identifying fake media content and reducing its harmful implications. Most detection approaches rely on identifying specific artifacts, which can quickly become obsolete given the fast advance of facial forgery methods. Some facial manipulation detection methods use temporal information to classify a video as real or fake; these mainly rely on 3D CNN architectures or on two-stream networks combining frame and video features. Our method also considers temporal aspects, but from a different perspective: it extracts features that account for inter-frame changes in a video. Inspired by the concept of ratio images, we extract features based on the ratio between adjacent frames for the face and its background. The experimental evaluation showed better results than state-of-the-art deepfake detection approaches in intra- and cross-dataset tests on the FaceForensics++ (FF++) and CelebDF datasets, covering both seen and unseen facial manipulation methods as well as seen and unseen video settings. In the intra-dataset experiments, the model achieved an AUC of 100% on both CelebDF and FF++. In the cross-dataset experiments, it achieved an AUC of 98% when trained on CelebDF and tested on FF++, and 86% when trained on FF++ and tested on CelebDF.
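
The inter-frame ratio idea can be sketched in a few lines. The following Python snippet is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the fixed boolean face mask, and the mean-ratio summary per region are hypothetical choices for demonstration.

```python
import numpy as np

def frame_ratio(prev_frame: np.ndarray, curr_frame: np.ndarray,
                eps: float = 1e-6) -> np.ndarray:
    """Per-pixel ratio between two adjacent frames (ratio-image idea).

    Frames are float arrays; eps avoids division by zero.
    """
    prev = prev_frame.astype(np.float64)
    curr = curr_frame.astype(np.float64)
    return curr / (prev + eps)

def ratio_features(frames: list, face_mask: np.ndarray) -> list:
    """Summarize inter-frame ratios separately for face and background.

    face_mask is a boolean array marking face pixels (assumed fixed
    across frames here, for simplicity). For each adjacent frame pair,
    returns the mean ratio inside and outside the face region.
    """
    feats = []
    for prev, curr in zip(frames, frames[1:]):
        r = frame_ratio(prev, curr)
        feats.append((float(r[face_mask].mean()),
                      float(r[~face_mask].mean())))
    return feats
```

In practice, the face mask would come from a face detector per frame, and the per-pair statistics would feed a classifier; this sketch only shows how a ratio between adjacent frames yields separate face and background descriptors.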
