Abstract

This letter focuses on image manipulation detection, which aims to recognize manipulated regions using contextual semantic information. Existing approaches usually overlook the semantic discrepancy between different levels of feature maps and directly fuse them (e.g., by addition or concatenation) for detection. In this letter, we argue that this semantic gap is the main reason for the low effectiveness of feature fusion in manipulation prediction. To address this problem, we propose a Global Semantic Consistency Network (GSCNet) for image manipulation detection, built on an encoder-decoder structure. Specifically, to let GSCNet capture more global texture information, which has been empirically shown to benefit manipulation detection, a Gram block is first applied to each level of feature maps in the encoding stage. On top of that, a bi-directional convolutional LSTM is applied in the decoding stage, so that feature maps at the same level remain semantically consistent. Experimental results on NIST16 and CASIA v1.0 demonstrate that GSCNet can accurately locate manipulated regions. Furthermore, compared to existing models, GSCNet achieves new state-of-the-art results.
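Although the abstract does not give implementation details, the texture-capturing idea behind the Gram block can be illustrated with a minimal sketch. The following hypothetical PyTorch example shows the standard Gram-matrix computation on a convolutional feature map; the class name `GramBlock`, the normalization factor, and how the resulting matrix would be fused back into the network are assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn

class GramBlock(nn.Module):
    """Illustrative sketch: channel-wise Gram matrix over a feature map,
    summarizing global texture statistics (not the paper's exact block)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from one encoder level
        b, c, h, w = x.shape
        feats = x.flatten(2)                            # (B, C, H*W)
        gram = torch.bmm(feats, feats.transpose(1, 2))  # (B, C, C) channel correlations
        return gram / (c * h * w)                       # normalize by feature size

# Usage: texture statistics for a batch of encoder features
feat = torch.randn(2, 64, 32, 32)
texture = GramBlock()(feat)  # -> shape (2, 64, 64)
```

In the style-transfer literature, such Gram matrices capture channel co-occurrence statistics of a feature map, which is consistent with the abstract's claim that global texture cues are beneficial for locating manipulated regions.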
