Abstract

With the development of generative models, the cost of facial manipulation and forgery keeps falling. Fraudulent data poses numerous hidden threats to politics, privacy, and cybersecurity. Although many face forgery detection methods focus on learning high-frequency forgery traces and achieve promising performance, they usually learn spatial and frequency features independently. To combine information from the two domains, a combined spatial and frequency dual-stream network is proposed for face forgery detection. Concretely, a cross self-attention (CSA) module is designed to improve the interaction and fusion of frequency features at different scales. Moreover, to augment semantic and contextual information, a frequency-guided spatial feature extraction module is proposed to extract and reconstruct spatial information. These two modules deeply mine forgery traces via a dual-stream collaborative network. Comprehensive experiments on different datasets demonstrate the effectiveness of the proposed method in both within-dataset and cross-dataset settings.
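The abstract does not specify the internals of the CSA module, so the following is only a minimal sketch of the general idea of cross-attention between two feature streams (spatial and frequency), written in plain NumPy. All function names, token counts, and dimensions here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feats, kv_feats):
    # Queries come from one stream, keys/values from the other,
    # so each stream can attend to information in its counterpart.
    # q_feats: (N, d) tokens; kv_feats: (M, d) tokens.
    d = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d)  # scaled dot-product
    return softmax(scores) @ kv_feats

rng = np.random.default_rng(0)
spatial = rng.standard_normal((16, 32))  # hypothetical 16 spatial tokens, 32-dim
freq = rng.standard_normal((16, 32))     # hypothetical 16 frequency tokens, 32-dim

# Bidirectional fusion with residual connections: each stream
# is enriched by attending to the other.
spatial_fused = spatial + cross_attention(spatial, freq)
freq_fused = freq + cross_attention(freq, spatial)
print(spatial_fused.shape, freq_fused.shape)  # (16, 32) (16, 32)
```

In a real dual-stream detector the two streams would be feature maps from learned backbones (e.g. an image encoder and a DCT/frequency branch) with learned query/key/value projections; this sketch only illustrates how the fusion step exchanges information between domains.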
