The image manipulation localization task differs from traditional computer vision tasks in that it requires capturing subtle, generic manipulation traces rather than semantic content. In this paper, we propose a novel method called irrelevant visual information suppression, which alleviates the interference of semantically irrelevant visual information with manipulation feature extraction, thereby obtaining generic manipulation traces that are subtle and independent of semantic visual information. Since most manipulation operations leave traces along manipulation boundaries, we further introduce a specially designed manipulated edge information enhancement branch that identifies these edge artifacts more accurately. We construct a dual-branch network in which each branch uses ResNet-50 as its backbone to capture multi-scale manipulation features. Finally, we adopt a multi-view feature learning scheme that combines the manipulated edge information enhancement branch with the irrelevant visual information suppression branch and trains the network with multi-scale (pixel/edge/image/irrelevant visual information suppression) supervision. To validate the effectiveness of the proposed method, we conduct extensive experiments on five image manipulation localization datasets: CASIAv1, CASIAv2, COVER, Columbia, and NIST16. The results demonstrate that our method outperforms state-of-the-art methods by a significant margin in terms of F1 score; on CASIAv1, COVER, and Columbia, for example, it improves the F1 score over MVSS-Net (ICCV 2021) by 7.1%, 6.3%, and 12.5%, respectively. The code is available at https://github.com/ginwins/ISIE-Net .
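To make the dual-branch design concrete, the following is a minimal PyTorch sketch assuming torchvision's ResNet-50 as each branch's backbone. The class names, the simple 1x1 fusion head, and the upsampling strategy are illustrative assumptions, not the authors' released implementation (refer to the repository above for the actual code and the multi-scale supervision losses).

```python
# Minimal sketch of a dual-branch manipulation localization network.
# Assumptions: torchvision ResNet-50 trunks, a hypothetical 1x1-conv fusion
# head, and bilinear upsampling back to input resolution.
import torch
import torch.nn as nn
from torchvision.models import resnet50


class Backbone(nn.Module):
    """ResNet-50 trunk returning a stride-32 feature map."""
    def __init__(self):
        super().__init__()
        trunk = resnet50(weights=None)
        # Drop the global average pooling and classification head.
        self.features = nn.Sequential(*list(trunk.children())[:-2])

    def forward(self, x):
        return self.features(x)  # (B, 2048, H/32, W/32)


class DualBranchLocalizer(nn.Module):
    """Two ResNet-50 branches (edge enhancement / irrelevant visual
    information suppression) fused into a per-pixel manipulation mask."""
    def __init__(self):
        super().__init__()
        self.edge_branch = Backbone()
        self.suppression_branch = Backbone()
        self.fuse = nn.Conv2d(4096, 1, kernel_size=1)  # hypothetical fusion head
        self.up = nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False)

    def forward(self, x):
        f_edge = self.edge_branch(x)          # edge-artifact features
        f_supp = self.suppression_branch(x)   # suppressed semantic features
        logits = self.fuse(torch.cat([f_edge, f_supp], dim=1))
        return torch.sigmoid(self.up(logits))  # per-pixel manipulation probability


if __name__ == "__main__":
    model = DualBranchLocalizer()
    pred = model(torch.randn(1, 3, 256, 256))
    print(pred.shape)  # torch.Size([1, 1, 256, 256])
```

In practice the two branch outputs would also feed the pixel-, edge-, and image-level supervision signals mentioned in the abstract; this sketch only shows the shared-backbone, dual-branch fusion structure.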