The rapid development of Generative Adversarial Networks (GANs) has made generated face images increasingly indistinguishable from real ones, and the detection performance of previous methods degrades severely when the test samples come from unseen datasets or have been post-processed. To address these problems, we propose a new Relational Embedding Network (RENet) for generated face detection, built around "what to observe" and "where to attend" from a relational perspective. In addition, we design two attention modules to effectively exploit global and local features. Specifically, the dual-self attention module selectively enhances the representation of local features along both the spatial and channel dimensions, while the cross-correlation attention module computes similarity between images to capture global contextual information. We conducted extensive experiments to validate our method; the proposed algorithm effectively extracts correlations between features and achieves satisfactory generalization and robustness in generated face detection. We also explore the design of the model structure and the detection performance on further categories of generated images (not limited to faces). The results show that RENet also achieves good detection performance on non-face datasets.
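To make the two modules concrete, below is a minimal PyTorch sketch of how a dual-self attention block (spatial plus channel self-attention) and a cross-correlation attention block might be structured. The module names, tensor shapes, pooling step, and dot-product affinities are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of the two attention modules described in the abstract.
# Shapes, projections, and affinity functions are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualSelfAttention(nn.Module):
    """Selectively enhances local features along spatial and channel dimensions."""
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma_s = nn.Parameter(torch.zeros(1))  # learnable spatial-branch weight
        self.gamma_c = nn.Parameter(torch.zeros(1))  # learnable channel-branch weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        n = h * w
        # Spatial self-attention: affinities between all spatial positions.
        q = self.query(x).reshape(b, -1, n).transpose(1, 2)        # (b, n, c')
        k = self.key(x).reshape(b, -1, n)                          # (b, c', n)
        v = self.value(x).reshape(b, c, n)                         # (b, c, n)
        spatial = F.softmax(torch.bmm(q, k), dim=-1)               # (b, n, n)
        out_s = torch.bmm(v, spatial.transpose(1, 2)).reshape(b, c, h, w)
        # Channel self-attention: affinities between channel maps.
        flat = x.reshape(b, c, n)
        channel = F.softmax(torch.bmm(flat, flat.transpose(1, 2)), dim=-1)  # (b, c, c)
        out_c = torch.bmm(channel, flat).reshape(b, c, h, w)
        return x + self.gamma_s * out_s + self.gamma_c * out_c

class CrossCorrelationAttention(nn.Module):
    """Computes similarity between image embeddings to capture global, cross-image cues."""
    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, dim) pooled embeddings, one per image in the batch.
        normed = F.normalize(feats, dim=-1)
        sim = normed @ normed.t()            # (batch, batch) cosine similarities
        weights = F.softmax(sim, dim=-1)
        return weights @ feats               # re-express each embedding via its relations

if __name__ == "__main__":
    dsa = DualSelfAttention(channels=64)
    local = dsa(torch.randn(4, 64, 16, 16))  # locally enhanced feature maps
    pooled = local.mean(dim=(2, 3))          # (4, 64) per-image embeddings
    relational = CrossCorrelationAttention()(pooled)
    print(local.shape, relational.shape)
```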