Abstract

Advances in video and image processing technology pose enormous challenges for video tampering forensics. Passive forensics for object-removal video forgery is particularly essential because it is a fundamental basis of judicial forensics. To extract tampering traces from video more thoroughly, we propose a spatiotemporal trident network based on the spatial rich model (SRM) and 3D convolution (C3D); its three branches can, in theory, improve the detection and localization accuracy of tampered regions. Building on this network, we design a temporal detector and a spatial locator to detect and locate tampered regions in the temporal and spatial domains of videos, respectively. The temporal detector uses 3D CNNs as the encoders in its three branches and a bidirectional long short-term memory (BiLSTM) network as the decoder. The spatial locator uses a backbone network named C3D-ResNet12 as the encoder in its three branches and region proposal networks (RPNs) as the decoders. In addition, we optimize the loss functions of both algorithms based on focal loss and GIoU loss. The experimental results demonstrate the effectiveness of the spatiotemporal detection and localization algorithms: for temporal forgery detection, the frame classification accuracy rose above 99%; for spatial forgery localization, the successful localization rate of tampered regions in forged frames exceeded 96%, and the mean intersection over union between the located and the real tampered regions exceeded 62%.
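
For readers unfamiliar with the two loss terms and the localization metric named above, the following is a minimal sketch of the standard binary focal loss and of IoU and GIoU for axis-aligned boxes. It is not the optimized loss used in this work, whose exact form the abstract does not give. The function names, the (x1, y1, x2, y2) box layout, and the defaults alpha = 0.25 and gamma = 2.0 (the values commonly used with focal loss) are assumptions made for illustration.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Standard focal loss for a single prediction.

    p is the predicted probability of the positive class, strictly
    between 0 and 1; y is the ground-truth label (0 or 1).
    alpha and gamma are illustrative defaults, not values from the paper.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 so that both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Area of the smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (enclose - union) / enclose
    return iou, giou


def giou_loss(box_a, box_b):
    """GIoU loss: 0 for identical boxes, approaching 2 for distant boxes."""
    return 1.0 - iou_and_giou(box_a, box_b)[1]
```

Unlike plain IoU, GIoU stays informative when the predicted and real regions do not overlap at all, because the enclosing-box term still varies with their distance. Averaging the IoU value over all located regions gives the mean intersection over union reported above.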
