Abstract

In recent years, various convolutional neural network (CNN) based frameworks have been proposed to detect forged regions in images. However, most existing models do not achieve satisfactory performance because tampered areas vary widely in size, and they struggle especially with large-scale objects. To obtain accurate object-level forgery localization, we propose a novel hybrid transformer architecture, TransU²-Net, which combines the advantages of spatial dependencies and contextual information from different scales. Specifically, long-range semantic dependencies are captured in the last block of the encoder to locate large-scale tampered areas more completely. Meanwhile, non-semantic features are filtered out by enhancing low-level features under the guidance of high-level semantic information in the skip connections, which yields more refined spatial recovery. As a result, our hybrid model can locate spliced forgeries of various sizes without requiring pre-training on a large data set. Compared with existing state-of-the-art CNN-based methods, our framework achieves better performance.
