Deepfake Detection Via Separable Self-Consistency Learning

  • Abstract
  • Similar Papers
Abstract

Deepfake detection technologies have developed rapidly in recent years, driven by the severe security threats posed by realistic deep facial forgeries. Among existing deepfake detection methods, self-supervised approaches have drawn significant attention from researchers because of their better generalization to forgeries produced by unseen deepfake techniques. Unfortunately, existing state-of-the-art self-supervised approaches do not properly account for the fact that pairs of patches from different regions contribute differently. As a result, their learned representations are coarse and their generalization performance suffers. In this paper, we propose a new self-supervised deepfake detection method, named deepfake detection via separable self-consistency learning (SSCL-DFD), to improve the generalization ability of deepfake detection. Specifically, to effectively extract detection features, we construct a multi-scale Texture Enhanced Feature Extraction Network (TEFEN), built around a Central-Difference based Convolution Module (CDCM) that enhances texture information, which contains rich forgery cues. Since pairs of patches from different regions (i.e., background and facial regions) tend to exhibit different consistencies, we propose a separable self-consistency loss to explicitly constrain the representation learning. Extensive experiments demonstrate that SSCL-DFD achieves superior generalization performance compared to state-of-the-art methods.
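The central-difference convolution idea behind a module like the CDCM can be sketched in a few lines. This is a generic illustration of central-difference convolution, not the paper's actual module; the kernel weights, `theta` value, and single-channel 3x3 setting are assumptions made for the example.

```python
def cdc_response(patch, kernel, theta=0.7):
    """Central-difference convolution on one 3x3 patch (single channel).

    Sketch of the generic CDC operator: the vanilla convolution response
    minus theta times a centre-gradient term, so flat (texture-free)
    regions are suppressed and texture cues stand out.
    """
    vanilla = sum(kernel[i][j] * patch[i][j] for i in range(3) for j in range(3))
    centre_term = patch[1][1] * sum(kernel[i][j] for i in range(3) for j in range(3))
    return vanilla - theta * centre_term

# A constant patch carries no texture: with theta = 1 the response vanishes,
# which is exactly the behaviour that makes CDC texture-sensitive.
flat = [[5.0] * 3 for _ in range(3)]
kernel = [[0.1, 0.2, 0.1], [0.2, 0.4, 0.2], [0.1, 0.2, 0.1]]
response = cdc_response(flat, kernel, theta=1.0)
```

With `theta = 0` the operator reduces to a vanilla convolution; increasing `theta` trades semantic intensity information for gradient (texture) information.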

Similar Papers
  • Research Article
  • Citations: 1
  • 10.1049/bme2/2217175
Wavelet‐Based Texture Mining and Enhancement for Face Forgery Detection
  • Jan 1, 2025
  • IET Biometrics
  • Xin Li + 3 more

Due to the abuse of deep forgery technology, research on forgery detection methods has become increasingly urgent. The correspondence between frequency-spectrum information and spatial clues, often neglected by current methods, could be conducive to more accurate and generalized forgery detection. Motivated by this observation, we propose a wavelet-based texture mining and enhancement framework for face forgery detection. First, we introduce a frequency-guided texture enhancement (FGTE) module that mines high-frequency information to improve the network's extraction of effective texture features. Next, we propose a global–local feature refinement (GLFR) module to strengthen the model's use of both global semantic features and local texture features. Moreover, an interactive fusion module (IFM) is designed to fully incorporate the enhanced texture clues with spatial features. The proposed method has been extensively evaluated on five public face forgery detection datasets: FaceForensics++ (FF++), the DeepFake Detection Challenge (DFDC), Celeb-DFv2, DFDC preview (DFDC-P), and DeepFake Detection (DFD), yielding promising performance in both within-dataset and cross-dataset experiments.
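The wavelet decomposition such a frequency-guided module could start from can be illustrated with a one-level 2D Haar transform. The Haar choice and the plain nested-list image representation are illustrative assumptions; the paper does not state which wavelet it uses.

```python
def haar2d(img):
    """One-level 2D Haar transform; returns (LL, LH, HL, HH) subbands.

    LL is the low-frequency approximation; LH/HL/HH hold the
    high-frequency detail that texture-mining modules typically exploit.
    Assumes even height and width.
    """
    h, w = len(img), len(img[0])
    LL, LH, HL, HH = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            LL[i // 2][j // 2] = (a + b + c + d) / 4   # average (low frequency)
            LH[i // 2][j // 2] = (a - b + c - d) / 4   # horizontal detail
            HL[i // 2][j // 2] = (a + b - c - d) / 4   # vertical detail
            HH[i // 2][j // 2] = (a - b - c + d) / 4   # diagonal detail
    return LL, LH, HL, HH

# A constant (texture-free) image yields empty detail subbands.
ll, lh, hl, hh = haar2d([[3.0] * 4 for _ in range(4)])
```

Forgery cues tend to concentrate in the detail subbands, which is why mining them can sharpen texture features.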

  • Research Article
  • Citations: 73
  • 10.3390/electronics13010095
A Comprehensive Review of DeepFake Detection Using Advanced Machine Learning and Fusion Methods
  • Dec 25, 2023
  • Electronics
  • Gourav Gupta + 5 more

Recent advances in Generative Artificial Intelligence (AI) have increased the possibility of generating hyper-realistic DeepFake videos or images that can cause serious harm to vulnerable children, individuals, and society at large through misinformation. To address this problem, many researchers have attempted to detect DeepFakes using advanced machine learning and fusion techniques. This paper presents a detailed review of past and present DeepFake detection methods, with a particular focus on media-modality fusion and machine learning, and provides detailed information on the benchmark datasets available for DeepFake detection research. The review covers 67 primary papers published between 2015 and 2023 on DeepFake detection: 55 on image and video DeepFake detection methodologies and 15 on speaker identification and verification. It offers valuable information on DeepFake detection research and a review analysis of advanced machine learning and modality fusion that sets it apart from other review papers. The paper further offers informed guidelines for future work in DeepFake detection utilizing state-of-the-art machine learning and information fusion models, supporting further advancement toward a sustainable and safer digital future.

  • Research Article
  • Citations: 45
  • 10.1016/j.eswa.2023.119843
DFGNN: An interpretable and generalized graph neural network for deepfakes detection
  • Mar 11, 2023
  • Expert Systems with Applications
  • Fatima Khalid + 4 more

DFGNN: An interpretable and generalized graph neural network for deepfakes detection

  • Research Article
  • 10.55041/ijsrem42998
An API-Integrated CNN–RNN Framework for Scalable Deepfake Detection
  • Mar 25, 2025
  • INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
  • Rajdeep Paul

Deepfake technology has been advancing rapidly, posing significant threats to media authenticity, cybersecurity, and public trust, and in particular to identity verification in digital media. To tackle this problem, a deepfake detection approach is presented. A Convolutional Neural Network (CNN) architecture, ResNeXt, and a Recurrent Neural Network (RNN) architecture, Long Short-Term Memory (LSTM), are used to train a deepfake detection model, and the full approach and training process are discussed. The model achieves 91% accuracy on the Celeb-DF dataset. The paper then discusses integration: how the model can be exposed as an Application Programming Interface (API) service for platforms and users, so that deepfakes can be detected through API calls to validate the authenticity of digital content. This work not only proposes a deepfake detection solution but also works toward a practical, real-world deployment of the research outcome. Key Words: Deepfake detection, CNN, RNN, API integration, ResNeXt, LSTM
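The frame-to-video aggregation step of such a CNN-RNN pipeline can be sketched briefly. Mean pooling of per-frame logits stands in here for the LSTM aggregator, and the logit values are invented for illustration; the actual pipeline would feed ResNeXt frame features through an LSTM instead.

```python
import math

def video_score(frame_logits):
    """Aggregate per-frame forgery logits into one video-level probability.

    Mean pooling is a deliberate simplification of the temporal model:
    the real system would use an LSTM over per-frame CNN features.
    """
    mean_logit = sum(frame_logits) / len(frame_logits)
    return 1.0 / (1.0 + math.exp(-mean_logit))  # sigmoid -> probability in (0, 1)

# Hypothetical per-frame logits from a frame-level classifier.
prob_fake = video_score([2.0, 1.5, 2.5])  # consistently high -> likely fake
```

An API wrapper would simply decode a video into frames, score them, and return this probability in its JSON response.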

  • Research Article
  • Citations: 25
  • 10.1145/3625100
Joint Audio-Visual Attention with Contrastive Learning for More General Deepfake Detection
  • Jan 22, 2024
  • ACM Transactions on Multimedia Computing, Communications, and Applications
  • Yibo Zhang + 2 more

With the continuous advancement of deepfake technology, there has been a surge in the creation of realistic fake videos. Unfortunately, the malicious utilization of deepfakes poses a significant threat to societal morality and political security, and numerous researchers have therefore proposed various deepfake detection methods. However, traditional detection approaches tend to focus on specific forgery features, such as artifacts or inconsistent actions, which can be vulnerable to specialized countermeasures. Recent studies show an intrinsic correlation between facial and audio cues, which can be exploited for deepfake detection. To address these challenges and enhance the robustness and generalization of deepfake detection algorithms, we propose a novel joint audio-visual deepfake detection model named AVA-CL, which is capable of detecting deepfakes in both the audio and visual domains. Exploiting the inherent correlation and consistency between audio and visual signals significantly enhances the effectiveness of deepfake detection. Through extensive experiments, we demonstrate that our proposed AVA-CL model outperforms many state-of-the-art (SOTA) methods with superior robustness and generalization capabilities. This research presents a promising approach for detecting deepfakes and reducing the harm caused by their malicious use.
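The audio-visual consistency idea can be pictured as a similarity score between modality embeddings: real pairs should align, while dubbed or forged pairs should not. The toy embeddings and the mapping to [0, 1] below are assumptions for illustration, not AVA-CL's learned attention mechanism.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def av_consistency(audio_emb, visual_emb):
    """Map cosine similarity to a [0, 1] consistency score.

    Genuine audio-visual pairs should score near 1; mismatched
    (potentially forged) pairs should score lower.
    """
    return 0.5 * (1.0 + cosine(audio_emb, visual_emb))

# Hypothetical embeddings: an aligned pair vs. a contradictory one.
v = [1.0, 2.0, 3.0]
aligned = av_consistency(v, v)
mismatched = av_consistency(v, [-1.0, -2.0, -3.0])
```

A contrastive loss then pulls matched pairs toward 1 and pushes mismatched pairs toward 0 during training.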

  • Research Article
  • Citations: 1
  • 10.60087/jaigs.v6i1.225
Adversarial Approaches to Deepfake Detection: A Theoretical Framework for Robust Defense
  • Sep 21, 2024
  • Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023
  • Sumit Lad

The rapid improvement in the capabilities of neural networks and generative adversarial networks (GANs) has given rise to extremely sophisticated deepfake technologies, making it very difficult to reliably recognize fake digital content and enabling the creation of highly convincing synthetic media that can be used maliciously in this era of user-generated information and social media. Existing deepfake detection techniques are effective against early iterations of deepfakes but grow increasingly vulnerable to more sophisticated deepfakes and adversarial attacks. In this paper we explore a novel approach to deepfake detection: a framework that integrates adversarial training to improve the robustness and accuracy of deepfake detection models. Drawing on state-of-the-art adversarial machine learning, forensic analysis, and deepfake detection techniques, we explore how adversarial training can harden detection against future threats. We use adversarial perturbations, examples designed specifically to deceive deepfake detection algorithms, and train detection models with these perturbations to create systems that identify deepfakes more accurately. Our approach shows promise and opens avenues for future research in building resilience against deepfakes, with applications in content moderation, security, and combating synthetic media manipulation.
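One concrete way to generate the adversarial perturbations described above is the fast gradient sign method (FGSM), shown here on a toy linear detector. The paper does not commit to FGSM specifically, and the weights and inputs below are invented, so treat this as an illustrative sketch rather than the paper's attack.

```python
def fgsm_perturb(x, grad, eps=0.05):
    """Fast-gradient-sign perturbation: nudge each input coordinate by
    eps in the direction that increases the detector's loss/score."""
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# For a linear detector score w.x, the gradient w.r.t. x is w itself,
# so the FGSM step raises the score by exactly eps * sum(|w_i|).
w = [0.5, -1.0, 2.0]          # hypothetical detector weights
x = [1.0, 1.0, 1.0]           # hypothetical input features
score = lambda v: sum(wi * vi for wi, vi in zip(w, v))
x_adv = fgsm_perturb(x, w, eps=0.1)
delta = score(x_adv) - score(x)  # approx 0.1 * (0.5 + 1.0 + 2.0) = 0.35
```

Adversarial training then mixes such perturbed examples into the training set so the detector learns to score them correctly.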

  • Research Article
  • Citations: 15
  • 10.1016/j.neunet.2024.106636
GazeForensics: DeepFake detection via gaze-guided spatial inconsistency learning
  • Aug 14, 2024
  • Neural Networks
  • Qinlin He + 4 more

GazeForensics: DeepFake detection via gaze-guided spatial inconsistency learning

  • Research Article
  • Citations: 172
  • 10.1007/s11263-022-01606-8
Countering Malicious DeepFakes: Survey, Battleground, and Horizon
  • Jan 1, 2022
  • International Journal of Computer Vision
  • Felix Juefei-Xu + 5 more

The creation or manipulation of facial appearance through deep generative approaches, known as DeepFake, has achieved significant progress and enabled a wide range of benign and malicious applications, e.g., visual-effect assistance in movies and misinformation generation by impersonating famous persons. The malicious side of this new technique has spurred another popular line of study, DeepFake detection, which aims to distinguish fake faces from real ones. With the rapid development of DeepFake-related studies in the community, the two sides (i.e., DeepFake generation and detection) have formed a battleground relationship, pushing each other's improvements and inspiring new directions, e.g., the evasion of DeepFake detection. Nevertheless, an overview of this battleground and of the new directions is missing from recent surveys due to the rapid increase of related publications, limiting in-depth understanding of the tendencies and future work. To fill this gap, we provide a comprehensive overview and detailed analysis of research on DeepFake generation, DeepFake detection, and evasion of DeepFake detection, with more than 318 research papers carefully surveyed. We present a taxonomy of DeepFake generation methods and a categorization of DeepFake detection methods, and, more importantly, we showcase the battleground between the two parties with detailed interactions between the adversaries (DeepFake generation) and the defenders (DeepFake detection). The battleground offers a fresh perspective on the latest landscape of DeepFake research and provides valuable analysis of the research challenges and opportunities as well as research trends and future directions. We also provide interactive diagrams (http://www.xujuefei.com/dfsurvey) to allow researchers to explore their own interests in popular DeepFake generators or detectors.

  • Research Article
  • Citations: 1
  • 10.5753/jis.2024.4120
Learning Self-distilled Features for Facial Deepfake Detection Using Visual Foundation Models: General Results and Demographic Analysis
  • Jul 9, 2024
  • Journal on Interactive Systems
  • Yan Martins Braz Gurevitz Cunha + 6 more

Modern deepfake techniques produce highly realistic false media content with the potential for spreading harmful information, including fake news and incitements to violence. Deepfake detection methods aim to identify and counteract such content by employing machine learning algorithms, focusing mainly on detecting the presence of manipulation using spatial and temporal features. These methods often utilize Foundation Models trained on extensive unlabeled data through self-supervised approaches. This work extends previous research on deepfake detection, focusing on the effectiveness of these models while also considering biases, particularly concerning age, gender, and ethnicity, for ethical analysis. Experiments with DINOv2, a novel Vision Transformer-based Foundation Model, trained using the diverse Deepfake Detection Challenge Dataset, which encompasses several lighting conditions, resolutions, and demographic attributes, demonstrated improved deepfake detection when combined with a CNN classifier, with minimal bias towards these demographic characteristics.

  • Research Article
  • Citations: 12
  • 10.3390/electronics13010126
Improving Detection of DeepFakes through Facial Region Analysis in Images
  • Dec 28, 2023
  • Electronics
  • Fatimah Alanazi + 2 more

In the evolving landscape of digital media, the discipline of media forensics, which encompasses the critical examination and authentication of digital images, videos, and audio recordings, has emerged as an area of paramount importance. This heightened significance is predominantly attributed to the burgeoning concerns surrounding the proliferation of DeepFakes, which are highly realistic and manipulated media content, often created using advanced artificial intelligence techniques. Such developments necessitate a profound understanding and advancement in media forensics to ensure the integrity of digital media in various domains. Current research endeavours are primarily directed towards addressing a common challenge observed in DeepFake datasets, which pertains to the issue of overfitting. Many suggested remedies centre around the application of data augmentation methods, with a frequently adopted strategy being the incorporation of random erasure or cutout. This method entails the random removal of sections from an image to introduce diversity and mitigate overfitting. Generating disparities between the altered and unaltered images serves to inhibit the model from excessively adapting itself to individual samples, thus leading to more favourable results. Nonetheless, the stochastic nature of this approach may inadvertently obscure facial regions that harbour vital information necessary for DeepFake detection. Due to the lack of guidelines on specific regions for cutout, most studies use a randomised approach. However, in recent research, face landmarks have been integrated to designate specific facial areas for removal, even though the selection remains somewhat random. Therefore, there is a need to acquire a more comprehensive insight into facial features and identify which regions hold more crucial data for the identification of DeepFakes. 
In this study, the investigation delves into the information conveyed by various facial components through the excision of distinct facial regions during model training. The goal is to offer insights that enhance future face removal techniques within DeepFake datasets, fostering a deeper comprehension among researchers and advancing the realm of DeepFake detection. Our study presents a novel method that uses face cutout techniques to improve understanding of the key facial features crucial to DeepFake detection. Moreover, the method combats overfitting in DeepFake datasets by generating diverse images with these techniques, thereby enhancing model robustness. The developed methodology is validated against publicly available datasets such as FF++ and Celeb-DFv2. Both face cutout groups surpassed the Baseline, indicating that cutouts improve DeepFake detection. Face Cutout Group 2 excelled, with 91% accuracy on Celeb-DF and 86% on the compound dataset, suggesting the significance of external facial features in detection. The study found that the eyes are the most impactful region, and the nose the least, for model performance. Future research could explore the augmentation policy's effect on video-based DeepFake detection.
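A minimal sketch of landmark-guided face cutout: erase a chosen facial box rather than a random one. The box coordinates, fill value, and nested-list image below are assumptions for illustration; the study's actual cutout groups are derived from detected face landmarks.

```python
def face_cutout(img, box, fill=0):
    """Erase one facial region in-place.

    `box` = (top, left, height, width), e.g. a box around the eyes
    chosen from face landmarks instead of a random location.
    """
    top, left, h, w = box
    for i in range(top, top + h):
        for j in range(left, left + w):
            img[i][j] = fill
    return img

# Hypothetical 8x8 single-channel "face" image; erase a 3x3 eye region.
img = [[255] * 8 for _ in range(8)]
face_cutout(img, (2, 2, 3, 3))
```

Applying different boxes (eyes, nose, mouth) during training is what lets the study compare how much each region contributes to detection.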

  • Research Article
  • 10.55041/ijsrem47208
Fake Media Forensics: AI-Driven Forensic Analysis of Fake Multimedia Content
  • May 7, 2025
  • INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
  • Deepak Naik

Abstract—Research and development on deepfake technology have reached new levels of sophistication, and these methods pose serious threats to digital security, privacy, and the fight against misinformation. Existing deepfake detection models primarily analyze video, audio, or image-based forgeries in isolation, and seldom employ unified multi-modal examination. The authors introduce a multi-modal deepfake detection system capable of detecting video manipulations, synthesized speech, and AI-generated images. The detection framework combines Convolutional Neural Networks (CNNs) with Transformers to identify discrepancies across input modalities, yielding better detection precision. The implementation includes Explainable AI (XAI) techniques, which enhance model interpretability by identifying major traces of forgery such as unnatural facial expressions, lip-sync mismatches, and audio waveform abnormalities. Self-supervised learning with built-in detection of evolving adversarial attacks is integrated into the system, so it can handle newly emerging deepfake generation techniques without explicit retraining. The work also introduces a blockchain-based system for forensic purposes, offering content authenticity through secure metadata verification of media files. Experimental results demonstrate a significant improvement in detection accuracy and robustness over standalone deepfake detection systems.
This study lays the foundations for real-time, scalable, and explainable deepfake detection solutions to fight the mounting threats posed by AI-generated media manipulation. Keywords—Deepfake detection, AI-generated media, Video forensics, Audio forensics, Explainable AI, Real-time detection, Blockchain authentication.
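The metadata-verification idea can be sketched with a content hash registered at publication time. A plain dict stands in for the blockchain ledger here, and the function names and media bytes are invented for the example; the paper's actual ledger design is not specified.

```python
import hashlib

def register_media(ledger, media_bytes, media_id):
    """Record the media file's SHA-256 digest at publication time.

    The dict stands in for an append-only blockchain ledger entry.
    """
    ledger[media_id] = hashlib.sha256(media_bytes).hexdigest()

def verify_media(ledger, media_bytes, media_id):
    """Re-hash the file and compare with the registered digest.

    Any post-publication tampering changes the digest and fails the check.
    """
    return ledger.get(media_id) == hashlib.sha256(media_bytes).hexdigest()

# Hypothetical usage: register original content, then verify two versions.
ledger = {}
register_media(ledger, b"original frame bytes", "clip-001")
ok = verify_media(ledger, b"original frame bytes", "clip-001")       # True
tampered = verify_media(ledger, b"tampered frame bytes", "clip-001")  # False
```

Note that this detects tampering of registered media; it says nothing about media never registered, which is why it complements rather than replaces the detection models.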

  • Book Chapter
  • Citations: 17
  • 10.1007/978-3-031-06365-7_22
Understanding the Security of Deepfake Detection
  • Jan 1, 2022
  • Lecture notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
  • Xiaoyu Cao + 1 more

Deepfakes pose growing challenges to the trust of information on the Internet, and detecting them has thus attracted increasing attention from both academia and industry. State-of-the-art deepfake detection methods consist of two key components, a face extractor and a face classifier, which extract the face region in an image and classify it as real or fake, respectively. Existing studies mainly focus on improving detection performance in non-adversarial settings, leaving the security of deepfake detection in adversarial settings largely unexplored. In this work, we aim to bridge that gap. In particular, we perform a systematic measurement study to understand the security of state-of-the-art deepfake detection methods in adversarial settings. We use two large-scale public deepfake data sources, FaceForensics++ and the Facebook Deepfake Detection Challenge, where the deepfakes are fake face images, and we train state-of-the-art deepfake detection methods. These methods achieve accuracies of 0.94 to 0.99 in non-adversarial settings on these datasets. However, our measurement results uncover multiple security limitations of the deepfake detection methods in adversarial settings. First, we find that an attacker can evade the face extractor, i.e., cause it to fail to extract the correct face regions, by adding small Gaussian noise to its deepfake images. Second, we find that a face classifier trained on deepfakes generated by one method cannot detect deepfakes generated by another, i.e., an attacker can evade detection by generating deepfakes with a new method. Third, we find that an attacker can leverage backdoor attacks developed by the adversarial machine learning community to evade the face classifier. Our results highlight that deepfake detection should consider the adversarial nature of the problem.
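The first evasion result, adding small Gaussian noise so the face extractor fails, can be reproduced in spirit with a few lines. The `sigma`, seed, and flat pixel list below are arbitrary choices for the sketch; the study's actual noise parameters and images differ.

```python
import random

def add_gaussian_noise(pixels, sigma=3.0, seed=0):
    """Add small additive Gaussian noise to a flat list of pixel values.

    This is the kind of low-magnitude perturbation the measurement study
    found sufficient to break face extraction; values are clipped to the
    valid [0, 255] range so the image stays displayable.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

# Hypothetical pixel values spanning the valid range.
noisy = add_gaussian_noise([0.0, 128.0, 255.0])
```

The point of the experiment is that such noise is nearly imperceptible to humans yet can still derail the face-extraction stage of a detection pipeline.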

  • Conference Article
  • 10.1109/icpr56361.2022.9956333
Contrastive Knowledge Transfer for Deepfake Detection with Limited Data
  • Aug 21, 2022
  • Dongze Li + 3 more

Modern forensics methods have shown remarkable progress in detecting maliciously crafted fake images. However, without exception, training deepfake detection models requires a large number of facial images, and the resulting models are often unsuitable for real-world applications because of their large size and slow inference. Data-efficient deepfake detection is therefore of great importance. In this paper, we propose a contrastive distillation method that maximizes a lower bound on the mutual information between the teacher and the student to improve the student's accuracy in a data-limited setting. We observe that deepfake detection models, unlike those for other image classification tasks, remain highly robust when the amount of training data drops. The proposed knowledge transfer approach outperforms the vanilla few-sample training baseline and other SOTA knowledge transfer methods. We believe we are the first to perform few-sample knowledge distillation for deepfake detection.
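The mutual-information lower bound maximized in contrastive distillation is typically an InfoNCE-style bound. The sketch below shows such a loss over teacher-student similarity scores; the similarity values, temperature, and scalar critic are invented for illustration and are not the paper's exact architecture.

```python
import math

def info_nce(sims, pos_index, temperature=0.1):
    """InfoNCE loss over similarity scores between a student feature and
    N teacher features, one of which (pos_index) is the true match.

    Minimising this loss maximises a lower bound on teacher-student
    mutual information of log(N) minus the loss.
    """
    logits = [s / temperature for s in sims]
    m = max(logits)  # subtract max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[pos_index] - log_z)

# With no signal (all pairs equally similar) the loss sits at log(N);
# any loss below log(N) certifies positive mutual information.
uninformative = info_nce([0.5, 0.5, 0.5, 0.5], pos_index=0)
informative = info_nce([0.9, 0.1, 0.1, 0.1], pos_index=0)
```

The gap between `log(N)` and the achieved loss is exactly the certified amount of transferred information, which is why larger negative sets tighten the bound.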

  • Research Article
  • Citations: 11
  • 10.1007/s10462-025-11286-8
A self-supervised BEiT model with a novel hierarchical patchReducer for efficient facial deepfake detection
  • Jun 12, 2025
  • Artificial Intelligence Review
  • Aneesa Al Redhaei + 2 more

The spread of deepfake technology has become a growing concern, especially with the rapid advancement of generative models, which has made it increasingly difficult to distinguish between real and fake facial videos and poses serious threats to security, privacy, and the fight against misinformation. Although deepfake detection models exist, many suffer from high computational costs, making them impractical for deployment in resource-constrained environments. To address this challenge, this paper proposes a solution that accurately identifies deepfakes while reducing computational complexity: a deepfake detection system using an improved version of a self-supervised BEiT, called BEiT-HPR (Hierarchical PatchReducer). The enhancement adds a Hierarchical PatchReducer layer that reduces the number of patches in successive encoder blocks, lowering computational complexity while maintaining high detection accuracy. In addition, training speed increases by over 50%, and the number of parameters is reduced by 63.4%. The BEiT-HPR model was tested on three publicly available benchmark datasets: FaceForensics++ (FF++), Celeb-DF, and the Deepfake Detection Dataset (DFD). The evaluation showed that reducing model complexity by 62.2% still allowed the proposed model to achieve accuracies of 83.92% on FF++, 97.59% on Celeb-DF, and 98.25% on DFD. These findings underscore the importance of computationally efficient deepfake detection methods that maintain high detection accuracy, and the proposed model offers a scalable solution for identifying deepfakes across diverse datasets.
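The hierarchical patch reduction can be pictured as repeatedly merging neighbouring tokens between encoder stages. Averaging adjacent pairs of scalar "tokens" is a deliberate simplification for the sketch: real tokens are vectors, and the published PatchReducer layer may merge them differently.

```python
def reduce_patches(tokens):
    """Merge each adjacent pair of patch tokens by averaging, halving the
    sequence length per encoder stage and thus the attention cost.

    Odd-length sequences are padded by repeating the last token.
    """
    if len(tokens) % 2:
        tokens = tokens + [tokens[-1]]
    return [(tokens[i] + tokens[i + 1]) / 2 for i in range(0, len(tokens), 2)]

# Two reduction stages shrink 8 tokens down to 2.
seq = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
for _ in range(2):
    seq = reduce_patches(seq)
```

Because self-attention cost grows quadratically in token count, halving the sequence at each stage is what buys the reported drop in compute.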

  • Research Article
  • Citations: 4
  • 10.1155/hbe2/1833228
Human Performance in Deepfake Detection: A Systematic Review
  • Jan 1, 2025
  • Human Behavior and Emerging Technologies
  • Klaire Somoray + 2 more

Deepfakes refer to a wide range of computer‐generated synthetic media, in which a person’s appearance or likeness is altered to resemble that of another. This systematic review is aimed at providing an overview of the existing research into people’s ability to detect deepfakes. Five databases (IEEE, ProQuest, PubMed, Web of Science, and Scopus) were searched up to December 2023. Studies were included if they (1) were an original study; (2) were reported in English; (3) examined people’s detection of deepfakes; (4) examined the influence of an intervention, strategy, or variable on deepfake detection; and (5) reported relevant data needed to evaluate detection accuracy. Forty independent studies from 30 unique records were included in the review. Results were narratively summarized, with key findings organized based on the review’s research questions. Studies used different performance measures, making it difficult to compare results across the literature. Detection accuracy varies widely, with some studies showing humans outperforming AI models and others indicating the opposite. Detection performance is also influenced by person‐level (e.g., cognitive ability, analytical thinking) and stimuli‐level factors (e.g., quality of deepfake, familiarity with the subject). Interventions to improve people’s deepfake detection yielded mixed results. Humans and AI‐based detection models focus on different aspects when detecting, suggesting a potential for human–AI collaboration. The findings highlight the complex interplay of factors influencing human deepfake detection and the need for further research to develop effective strategies for deepfake detection.
