Abstract

With the advancement of image manipulation technologies, image modification is becoming increasingly convenient and imperceptible. To tackle the challenging image tampering detection problem, this article presents an attentional cross-domain deep architecture, which can be trained end-to-end. This architecture is composed of three convolutional neural network (CNN) streams to extract three types of features, including visual perception, resampling, and local inconsistency features, from spatial and frequency domains. The multitype and cross-domain features are then combined to formulate hybrid features to distinguish manipulated regions from nonmanipulated parts. Compared with other deep architectures, the proposed one spans a more complementary and discriminative feature space by integrating richer types of features from different domains in a unified end-to-end trainable framework and thus can better capture artifacts caused by different types of manipulations. In addition, we design and train a module called tampering discriminative attention network (TDA-Net) to highlight suspicious parts. These part-level representations are then integrated with the global ones to further enhance the discriminating capability of the hybrid features. To adequately train the proposed architecture, we synthesize a large dataset containing various types of manipulations based on DRESDEN and COCO. Experiments on four public datasets demonstrate that the proposed model can localize various manipulations and achieve state-of-the-art performance. We also conduct ablation studies to verify the effectiveness of each stream and the TDA-Net module.
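A minimal sketch, in PyTorch, of how the three-stream design and the attention-based fusion described above might be wired together. The module names (ThreeStreamTamperNet, TDANet), channel sizes, stream backbones, and pre-processed inputs are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative three-stream cross-domain architecture with an attention module.
# All layer choices here are assumptions made for the sketch.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, standing in for a stream backbone."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )


class TDANet(nn.Module):
    """Hypothetical attention module: predicts a spatial map of suspicious regions
    and fuses the attended (part-level) features with the global ones."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):
        attn = torch.sigmoid(self.score(feats))       # (B, 1, H, W) attention map
        part = feats * attn                           # part-level (attended) features
        return torch.cat([feats, part], dim=1), attn  # global + part-level fusion


class ThreeStreamTamperNet(nn.Module):
    def __init__(self, base=32):
        super().__init__()
        self.rgb_stream = conv_block(3, base)        # visual perception features
        self.resample_stream = conv_block(3, base)   # resampling / frequency-domain features
        self.noise_stream = conv_block(3, base)      # local inconsistency (noise) features
        self.tda = TDANet(3 * base)
        self.head = nn.Conv2d(6 * base, 1, kernel_size=1)  # per-pixel tampering logits

    def forward(self, rgb, freq, noise):
        # Each input is a (B, 3, H, W) tensor from a different domain / pre-processing.
        hybrid = torch.cat(
            [self.rgb_stream(rgb), self.resample_stream(freq), self.noise_stream(noise)],
            dim=1,
        )
        fused, attn = self.tda(hybrid)
        return self.head(fused), attn                 # localization logits + attention map


if __name__ == "__main__":
    x = torch.randn(2, 3, 64, 64)
    model = ThreeStreamTamperNet()
    mask_logits, attn = model(x, x, x)
    print(mask_logits.shape, attn.shape)  # both torch.Size([2, 1, 64, 64])
```

The sketch only conveys the data flow: three domain-specific streams are concatenated into hybrid features, an attention map re-weights them to obtain part-level representations, and the fused result feeds a per-pixel localization head.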
