Abstract

Deep learning algorithms have shown strong performance in multimedia forensics applications when trained with supervised learning on large-scale labeled datasets. However, constructing such labeled datasets can be difficult and costly in many multimedia forensics scenarios. In addition, heavyweight deep learning models with complex architectures and large numbers of parameters require significant hardware resources for training. To address these challenges for image cropping detection, a common multimedia forensics application, we propose a semi-supervised deep learning framework that can be trained on a large number of unlabeled image samples. In this framework, a teacher model trained on a small set of labeled image samples assigns confidence scores to the samples in a large-scale unlabeled dataset and ranks them. We then use the ranked samples to train a student network. To validate the effectiveness of our collaborative training framework across various image cropping detection scenarios, we conduct extensive experiments on a large-scale dataset. The experimental results show that our semi-supervised approach achieves state-of-the-art performance compared with existing supervised detection frameworks, with an accuracy of 91.79% on the BOSSbase dataset and 89.23% on the Alaska dataset. Furthermore, we investigate in depth the factors that influence detection performance in semi-supervised learning: the pairing of teacher and student models, the top-K selection approach, the number of unlabeled samples, the number of self-training iterations, and the proportion of high-confidence samples used in semi-supervised learning.
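
The ranking and top-K selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how confidence is computed, so the sketch assumes it is the teacher's highest class probability, and the names `select_top_k`, `toy_teacher`, and `pool` are hypothetical. The toy teacher stands in for a trained cropping detector, with class 1 meaning "cropped" and class 0 meaning "not cropped".

```python
from typing import Callable, List, Sequence, Tuple


def select_top_k(
    unlabeled: Sequence[object],
    teacher_predict: Callable[[object], Sequence[float]],
    k: int,
) -> List[Tuple[object, int, float]]:
    """Rank unlabeled samples by teacher confidence and keep the k best.

    Returns (sample, pseudo_label, confidence) tuples, most confident first.
    Confidence is assumed to be the largest class probability.
    """
    scored = []
    for sample in unlabeled:
        probs = teacher_predict(sample)
        # Pseudo-label = index of the most probable class.
        label = max(range(len(probs)), key=lambda i: probs[i])
        scored.append((sample, label, probs[label]))
    # Sort by confidence, highest first, then truncate to the top k.
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:k]


def toy_teacher(x: float) -> List[float]:
    # Hypothetical stand-in for a trained teacher: treats the input value
    # itself as the probability of class 1 ("cropped").
    return [1.0 - x, x]


# Toy "unlabeled pool"; a real pool would hold images, not floats.
pool = [0.95, 0.55, 0.10, 0.48, 0.80]
selected = select_top_k(pool, toy_teacher, k=3)

for sample, label, confidence in selected:
    print(sample, label, round(confidence, 2))
```

In a self-training loop, the selected pseudo-labeled samples would be added to the student's training set, and the process could be repeated for several iterations with different values of `k`.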
