Abstract

With advances in deep learning, faces in videos can now be replaced easily, and the resulting manipulated videos are realistic enough that the human eye cannot distinguish them from genuine footage. Some people use this technology maliciously to attack others, especially celebrities and politicians, with destructive social impact. An accurate method for detecting face manipulation is therefore imperative. However, most existing methods adopt a single convolutional neural network as the feature extraction module, so the extracted features are inconsistent with the human visual mechanism. Moreover, a single feature cannot reflect both rich details and semantic information, which limits detection performance. This paper addresses these problems by proposing a novel face manipulation detection method based on a supervised multi-feature fusion attention network (SMFAN). Specifically, a capsule network is used for face manipulation detection, and the SMFAN is added to the original capsule network to extract details of the fake face image. In addition, focal loss is used to realize hard example mining. Experimental results on the public dataset FaceForensics++ show that the proposed method achieves better performance.

Highlights

  • Face manipulation technology is a novel way to replace human faces in videos

  • Fake videos made with face manipulation technology can ruin the image of celebrities and politicians. For instance, the manipulated video of U.S. Democratic leader Nancy Pelosi appearing drunk, which Trump shared on Facebook, attracted great attention on the Internet

  • This paper proposes using a supervised multi-feature fusion attention network (SMFAN) to optimize the feature extraction module of the capsule network, and replacing the cross-entropy loss in the original network with focal loss
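
To illustrate why replacing cross-entropy with focal loss amounts to hard example mining, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). It is not the paper's implementation: the function names, the label convention (1 = fake, 0 = real), and the values gamma = 2 and alpha = 0.25 are assumptions chosen for illustration, and the paper's actual settings are not given in this excerpt.

```python
import math

EPS = 1e-12  # assumed clamp to avoid log(0); not taken from the paper


def cross_entropy(p, y):
    """Binary cross-entropy for one sample.

    p: predicted probability that the face is fake (assumed convention)
    y: ground-truth label, 1 = fake, 0 = real (assumed convention)
    """
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    p_t = min(max(p_t, EPS), 1.0)
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample (gamma and alpha are illustrative)."""
    p_t = p if y == 1 else 1.0 - p
    p_t = min(max(p_t, EPS), 1.0)
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified (easy) samples,
    # so misclassified (hard) samples dominate the total loss.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = 0.9  # fake face predicted as fake with high confidence
    hard = 0.1  # fake face predicted as real
    print("CE    easy/hard:", cross_entropy(easy, 1), cross_entropy(hard, 1))
    print("focal easy/hard:", focal_loss(easy, 1), focal_loss(hard, 1))
```

With gamma = 0 the modulating factor disappears and the function reduces to alpha-weighted cross-entropy. Larger gamma values down-weight easy samples more strongly, which is the sense in which focal loss concentrates training on hard examples.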

Summary

Introduction

Face manipulation technology is a novel way to replace human faces in videos. Face manipulation is becoming easier to realize thanks to the development of convolutional neural networks (CNNs) [1] and generative adversarial nets (GANs) [2]. Face manipulation methods can be divided into two categories: face synthesis and face swap. Face synthesis mainly generates non-existent but highly realistic human faces through GANs, such as Cycle-Consistent Adversarial Networks (CycleGAN) [3] and Star Generative Adversarial Nets (StarGAN) [4]. Facial expression swap, for which Face2Face and FaceApp are popular tools, modifies facial expressions, for example changing a crying expression into a laughing one. It is necessary to design a method that can detect all of these manipulation types.
