Modern image editing software enables anyone to alter the content of an image to deceive the public, posing a threat to personal privacy and public safety. Detecting and localizing image tampering has therefore become an urgent problem. We observe that, after manipulations such as splicing, copy-move, and removal, the tampered region exhibits homogeny differences (changes in the organizational form and organizational structure of the image's metadata) relative to the authentic region. We therefore propose a novel end-to-end network, HDF-Net, that extracts these homogeny difference features to precisely localize tampering artifacts. HDF-Net is a dual-stream network with RGB and SRM streams and includes three complementary modules: the suspicious tampering-artifact prominent (STP) module, the fine tampering-artifact salient (FTS) module, and the tampering-artifact edge refined (TER) module. We use the fully attentional block (FLA) to strengthen the representation of the homogeny difference features extracted by each module and to preserve the details of tampering artifacts. The modules are merged progressively following a "coarse-fine-finer" strategy, which significantly improves localization accuracy and edge refinement. Extensive experiments on five benchmarks demonstrate that HDF-Net outperforms state-of-the-art tampering localization models, with satisfactory generalization and robustness.
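The abstract does not say which filters feed the SRM stream. As a rough illustration of the kind of input an SRM (steganalysis rich model) stream typically receives, the sketch below applies three high-pass kernels that are commonly used in SRM-based forensics work to a grayscale image. The kernel choice, the edge padding, and the function names `srm_kernels` and `srm_residuals` are assumptions made for this example, not details taken from HDF-Net.

```python
import numpy as np


def srm_kernels():
    """Return three 5x5 high-pass kernels often used as SRM filters.

    Assumption: these are the kernels commonly seen in SRM-based
    forensics models; the abstract does not state HDF-Net's choice.
    """
    k1 = np.array([[0, 0, 0, 0, 0],
                   [0, -1, 2, -1, 0],
                   [0, 2, -4, 2, 0],
                   [0, -1, 2, -1, 0],
                   [0, 0, 0, 0, 0]], dtype=np.float64) / 4.0
    k2 = np.array([[-1, 2, -2, 2, -1],
                   [2, -6, 8, -6, 2],
                   [-2, 8, -12, 8, -2],
                   [2, -6, 8, -6, 2],
                   [-1, 2, -2, 2, -1]], dtype=np.float64) / 12.0
    k3 = np.array([[0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0],
                   [0, 1, -2, 1, 0],
                   [0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0]], dtype=np.float64) / 2.0
    return np.stack([k1, k2, k3])  # shape (3, 5, 5)


def srm_residuals(gray):
    """Filter a 2-D grayscale image with each kernel.

    Returns an array of shape (3, H, W) holding the noise residuals.
    Borders are handled by replicating edge pixels (an assumption).
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    padded = np.pad(gray, 2, mode="edge")
    kernels = srm_kernels()
    out = np.zeros((kernels.shape[0], h, w), dtype=np.float64)
    # Accumulate one kernel tap at a time; each tap is a shifted view
    # of the padded image weighted by that tap's coefficient.
    for dy in range(5):
        for dx in range(5):
            patch = padded[dy:dy + h, dx:dx + w]
            out += kernels[:, dy, dx][:, None, None] * patch
    return out


if __name__ == "__main__":
    # Toy example: a flat image with a brighter pasted block.
    image = np.full((16, 16), 100.0)
    image[6:10, 6:10] = 200.0
    residuals = srm_residuals(image)
    print(residuals.shape)
    print(np.abs(residuals).max(axis=(1, 2)))
```

Because every kernel sums to zero, smooth content is suppressed and the residual is nonzero only where local pixel statistics change, such as along the boundary of the pasted block in the toy image. That is the usual motivation for pairing a noise-residual stream with an RGB stream.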