Abstract

With the recent rise of realistic face manipulation methods, building robust face tampering detection methods has become more critical than ever. Several research works have focused on extracting multi-scale features to enhance feature learning. However, most such works share a design flaw: they combine information from multiple scales in equal proportion. This is not the best approach, as features from one scale can be more important than features from the others. To this end, a novel deepfake detection architecture, Face-NeSt, is proposed. Face-NeSt employs a novel ‘adaptively weighted multi-scale attentional’ (AW-MSA) module that chooses the proportion of multi-scale features best suited to the final prediction. Its attention mechanism highlights important feature regions along the spatial and channel dimensions, both locally and globally. Unlike popular recent computer vision models, Face-NeSt is designed to be computationally lightweight. Face-NeSt performs strongly on three publicly available benchmark datasets: FaceForensics++ (FF++), CelebDF and the Deepfake Detection Challenge (DFDC). The AUC scores are 0.9823 on CelebDF, 0.9947 on DFDC, 0.9945 on DeepFake (FF++), 0.9905 on Face2Face (FF++), 0.9978 on FaceShifter (FF++), 0.9948 on FaceSwap (FF++) and 0.9548 on NeuralTextures (FF++). These results highlight Face-NeSt's efficacy, as it outperforms all state-of-the-art (SOTA) approaches for facial tampering detection.
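The contrast between equal-proportion and adaptively weighted fusion can be illustrated with a minimal sketch. This is not the AW-MSA module itself, whose internals the abstract does not specify; it assumes, purely for illustration, that each scale gets one scalar logit and that a softmax turns the logits into mixing proportions. The names `fuse_multiscale` and `scale_logits` are hypothetical, and the toy feature vectors stand in for real feature maps.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_multiscale(features, scale_logits):
    """Fuse same-length feature vectors from several scales.

    Assumption (not from the paper): one scalar logit per scale,
    normalised with softmax so the proportions sum to 1.
    """
    weights = softmax(scale_logits)
    length = len(features[0])
    fused = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]
    return fused, weights


# Toy features from three scales (hypothetical values).
features = [
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 1.0, 0.0, 1.0],
    [4.0, 3.0, 2.0, 1.0],
]

# Equal logits reproduce the equal-proportion baseline (a plain mean).
equal_fused, equal_w = fuse_multiscale(features, [0.0, 0.0, 0.0])

# Unequal logits let one scale dominate; in a trained model these
# would be learned rather than set by hand.
adaptive_fused, adaptive_w = fuse_multiscale(features, [2.0, 0.5, -1.0])
```

With all logits equal, the fusion reduces to averaging the scales, which is the equal-proportion design the abstract criticises. Letting the logits differ is one simple way to give some scales more influence than others.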
