Abstract
Advances in deep learning have made it easy to replace faces in videos, and the manipulated videos are realistic enough that the human eye cannot distinguish them from genuine footage. Some people maliciously use the technology to attack others, especially celebrities and politicians, causing destructive social impacts. It is therefore imperative to design an accurate method for detecting face manipulation. However, most existing methods adopt a single convolutional neural network as the feature extraction module, so the extracted features are inconsistent with the human visual mechanism. Moreover, a single feature cannot reflect the rich detail and semantic information of an image, which limits detection performance. This paper tackles these problems by proposing a novel face manipulation detection method based on a supervised multi-feature fusion attention network (SMFAN). Specifically, a capsule network is used for face manipulation detection, and the SMFAN is added to the original capsule network to extract details of the fake face image. Further, focal loss is used to realize hard example mining. Finally, experimental results on the public dataset FaceForensics++ show that the proposed method achieves better performance.
Highlights
Face manipulation technology is a novel way to replace human faces in videos
Using face manipulation technology to make fake videos will undoubtedly damage the image of celebrities and politicians; for instance, the manipulated video of U.S. Democratic leader Nancy Pelosi appearing drunk, which Trump shared on Facebook, attracted great attention on the Internet
This paper proposes using a supervised multi-feature fusion attention network (SMFAN) to optimize the feature extraction module of the capsule network, and replacing the cross-entropy loss of the original network with focal loss
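To illustrate why focal loss acts as a form of hard example mining, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), next to plain cross-entropy. The function names and the values gamma = 2 and alpha = 0.25 are assumptions for illustration (they are the commonly cited defaults for focal loss), not settings taken from this paper, and the paper's actual loss inside the capsule network may differ.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy for one sample.

    p: predicted probability that the face is fake (class 1).
    y: ground-truth label, 0 (real) or 1 (fake).
    """
    p = min(max(p, eps), 1.0 - eps)       # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p        # probability given to the true class
    return -math.log(p_t)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one sample (hypothetical helper, assumed defaults).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    samples, so hard samples dominate the total loss.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For a fake face predicted with probability 0.95 (an easy example), the modulating factor is (1 - 0.95)^2 = 0.0025, whereas for one predicted with probability 0.1 (a hard example) it is 0.81, so the hard example contributes far more to the loss relative to its cross-entropy value.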
Summary
Face manipulation technology is a novel way to replace human faces in videos. Face manipulation is becoming easier to carry out, benefiting from the development of convolutional neural networks (CNNs) [1] and generative adversarial nets (GANs) [2]. Face manipulation methods can be divided into two categories: face synthesis and face swap. Face synthesis mainly generates non-existent but highly realistic human faces through GANs, such as Cycle-Consistent Adversarial Networks (CycleGAN) [3] and Star Generative Adversarial Nets (StarGAN) [4]. Facial expression swap, for which Face2Face and FaceApp are popular tools, modifies facial expressions, for example turning a crying expression into a laughing one. It is necessary to design a method that detects all of these manipulation types.