Abstract

The ever-growing threat of deepfakes and their large-scale societal implications have propelled the development of deepfake forensics to ascertain the trustworthiness of digital media. A common theme of existing detection methods is the use of Convolutional Neural Networks (CNNs) as a backbone. While CNNs have demonstrated decent performance in learning local discriminative information, they fail to learn relative spatial features and lose important information due to their constrained receptive fields. Motivated by these challenges, this work presents DFDT, an end-to-end deepfake detection framework that leverages the unique characteristics of transformer models to learn hidden traces of perturbations from both local image features and the global relationship of pixels at different forgery scales. DFDT is specifically designed for deepfake detection tasks and consists of four main components: patch extraction & embedding, a multi-stream transformer block, attention-based patch selection, and a multi-scale classifier. DFDT’s transformer layer benefits from a re-attention mechanism instead of a traditional multi-head self-attention layer. To evaluate the performance of DFDT, a comprehensive set of experiments is conducted on several deepfake forensics benchmarks. The obtained results demonstrate DFDT’s superior detection rate, achieving 99.41%, 99.31%, and 81.35% on FaceForensics++, Celeb-DF (V2), and WildDeepfake, respectively. Moreover, DFDT’s excellent cross-dataset & cross-manipulation generalization provides further strong evidence of its effectiveness.
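The abstract names a re-attention mechanism as the replacement for the standard multi-head self-attention layer inside DFDT's transformer block. The snippet below is a minimal PyTorch sketch of such a layer, assuming the common formulation in which a learnable head-mixing matrix (theta here) recombines attention maps across heads before they are applied to the values; the dimensions, normalization choice, and exact wiring are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch of a re-attention layer of the kind the abstract describes:
# attention maps from different heads are mixed by a learnable matrix instead
# of being used independently as in plain multi-head self-attention.
# Dimensions and the BatchNorm placement are assumptions, not DFDT's exact code.
import torch
import torch.nn as nn


class ReAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        # Learnable head-mixing matrix (theta): recombines per-head attention maps.
        self.theta = nn.Parameter(torch.eye(num_heads) + 0.01 * torch.randn(num_heads, num_heads))
        self.norm = nn.BatchNorm2d(num_heads)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, c = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim).permute(2, 0, 3, 1, 4)
        q, k, v = qkv[0], qkv[1], qkv[2]                # each: (b, heads, n, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)                     # (b, heads, n, n)
        # Mix the attention maps across heads with theta, then normalize.
        attn = torch.einsum('hg,bgij->bhij', self.theta, attn)
        attn = self.norm(attn)
        out = (attn @ v).transpose(1, 2).reshape(b, n, c)
        return self.proj(out)
```

Swapping this module in for a standard self-attention layer leaves the rest of a ViT-style block (layer normalization, MLP, residual connections) unchanged.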

Highlights

  • The recent advances in the field of Artificial Intelligence (AI), Generative Adversarial Networks (GANs) [1,2], and the abundance of training samples, along with robust computational resources [3], have significantly propelled the field of AI-generated fake information of all kinds, e.g., deepfakes

  • The ever-growing threat of deepfakes and their large-scale societal implications have driven the development of deepfake forensics to ascertain the trustworthiness and authenticity of digital media


Summary

Introduction

The recent advances in the field of Artificial Intelligence (AI), particularly Generative Adversarial Networks (GANs) [1,2], together with the abundance of training samples and robust computational resources [3], have significantly propelled the field of AI-generated fake information of all kinds, e.g., deepfakes. Deepfake generation algorithms are constantly evolving and have become a potent tool for adversarial entities to perpetrate and disseminate criminal content in various forms, including ransomware, digital kidnapping, etc. The fact that deepfakes are GAN-generated digital content and not actual events captured by a camera implies that they can still be detected using advanced AI models [13]. It has been proven that deep neural networks tend to achieve better performance than traditional image forensic tools [9]. Typical components of most state-of-the-art deepfake detection approaches are convolutional neural networks (CNNs) and facial regions cropped out of an entire image [14–16]. Although CNNs have proven themselves solid candidates for learning local image information, they still fail to capture pixels’ spatial interdependence due to their constrained receptive fields.
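To make the preprocessing described above concrete, the following sketch crops a detected face region and splits it into fixed-size patches that are linearly projected into token embeddings, i.e., the patch extraction & embedding stage the abstract refers to. The bounding box source, patch size (16), and embedding dimension (768) are hypothetical choices for illustration and are not taken from the paper.

```python
# A minimal sketch of face cropping followed by patch extraction & embedding.
# The face bounding box is assumed to come from an off-the-shelf face detector;
# patch size and embedding dimension are illustrative values, not the paper's.
import torch
import torch.nn as nn


class PatchEmbed(nn.Module):
    """Split an image into non-overlapping patches and project them to tokens."""

    def __init__(self, img_size: int = 224, patch_size: int = 16, embed_dim: int = 768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to flattening each patch and
        # applying a shared linear projection.
        self.proj = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 3, img_size, img_size) -> (batch, num_patches, embed_dim)
        return self.proj(x).flatten(2).transpose(1, 2)


def crop_face(frame: torch.Tensor, box: tuple, out_size: int = 224) -> torch.Tensor:
    """Crop a detected face region (x1, y1, x2, y2) and resize it to the model input size."""
    x1, y1, x2, y2 = box                 # box assumed to come from a face detector
    face = frame[:, :, y1:y2, x1:x2]
    return nn.functional.interpolate(face, size=(out_size, out_size),
                                     mode='bilinear', align_corners=False)
```

The resulting token sequence is what a transformer-based detector attends over, which is how global pixel relationships beyond a CNN's receptive field can be captured.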
