Abstract

Recently, the spread of videos forged by deepfake tools has raised wide concern, and effective ways of detecting them are urgently needed. Such artificial intelligence-aided forgery is known to leave artifacts at no fewer than three levels: microcosmic (statistical) features, mesoscopic features, and macroscopic (semantic) features. However, existing detection methods have not been designed to exploit them all. This work proposes a new approach to more effective detection of deepfake videos. A multi-layer fusion neural network (MFNN) is designed to capture artifacts at different levels. Feature maps output by specially designed shallow, middle, and deep layers, which serve as statistical, mesoscopic, and semantic features, respectively, are fused together before classification. The FaceForensics++ dataset was used to train and test the method. The experimental results show that MFNN outperforms other relevant methods. In particular, it shows a greater advantage in detecting low-quality deepfake videos.

Highlights

  • The human face is the most significant marker of human identity

  • The experimental results of the multi-layer fusion neural network (MFNN) are compared with those of existing methods

  • This work proposes a new approach to more accurate detection of face forgery in Deepfake videos

Summary

INTRODUCTION

The human face is the most significant marker of human identity. Nowadays, digital videos showing human faces are widely used on many serious occasions, such as court evidence and news reports. The hierarchical nature of CNNs led us to consider combining feature maps from different layers for more effective detection of Deepfake videos. Based on this observation, we have designed a multi-layer fusion neural network (MFNN) to better capture different levels of Deepfake forgery features for classification. The learning capability of the network cannot be fully exploited if feature maps from different layers, which differ in characteristics and dimensionality, are simply combined and sent directly to the final decision-making part, that is, the fully connected layer. These feature maps need further processing at different stages before they reach the decision-making part.
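
The need for processing before fusion can be illustrated with a minimal sketch. This is not the MFNN architecture itself, whose layer design is not given in this summary. It assumes, purely for illustration, that each feature map is reduced by global average pooling to one value per channel, so that maps of different spatial sizes become vectors that can be concatenated into a single input for a fully connected layer. The names global_average_pool and fuse and the toy shapes are hypothetical.

```python
from typing import List

# A feature map is stored as channels x height x width (nested lists).
FeatureMap = List[List[List[float]]]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Reduce a C x H x W feature map to a length-C vector (one mean per channel)."""
    pooled = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def fuse(feature_maps: List[FeatureMap]) -> List[float]:
    """Pool each map to a vector, then concatenate the vectors (assumed fusion rule)."""
    fused: List[float] = []
    for fmap in feature_maps:
        fused.extend(global_average_pool(fmap))
    return fused


# Toy maps: spatial size shrinks and channel count grows with depth.
shallow = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]  # 2 x 4 x 4
middle = [[[2.0] * 2 for _ in range(2)] for _ in range(3)]   # 3 x 2 x 2
deep = [[[3.0]] for _ in range(4)]                           # 4 x 1 x 1

# One entry per channel across the three maps: 2 + 3 + 4 = 9 values.
fused_vector = fuse([shallow, middle, deep])
```

Without the pooling step the three maps could not be joined directly, because their heights and widths differ. A real network would more likely use learned layers at this stage; pooling is used here only because it is the simplest operation that makes the shapes compatible.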

EXPERIMENTS
Detection Methods
CONCLUSION