Abstract

Recent studies have demonstrated that deep learning-based classification models can reliably detect human actions in video. At the same time, automatic violence detection from videos has become essential to prevent the spread of harmful content on digital platforms. Despite their remarkable success, these neural networks are vulnerable to adversarial attacks, which highlights the need to evaluate the robustness of state-of-the-art violence detection classifiers. Here, we propose a transferable logit attack that induces binary misclassification of video data, evading the classifier with spatially perturbed, synthesized adversarial samples. We adopt an adversarial falsification threat model to validate a non-sparse, white-box attack setting that generates cross-domain adversarial video samples by perturbing only spatial features, leaving the temporal features unaffected. We carry out extensive experiments on the validation sets of two popular violence detection datasets, the Hockey Fight Dataset and the Movie Dataset, and verify that our proposed attack achieves a high success rate against a state-of-the-art violence detection classifier. This work aims to help make future violence detection models more resistant to adversarial examples.

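Although the abstract does not spell out the attack procedure, the minimal sketch below illustrates the general idea of a white-box, logit-driven attack that perturbs only the spatial (pixel) content of video frames while leaving the temporal structure (frame order and count) untouched. The classifier `model`, the tensor layout, and the parameters `eps`, `steps`, and `alpha` are illustrative assumptions, not the authors' actual implementation.

```python
import torch

def spatial_logit_attack(model, video, true_label, eps=0.03, steps=10, alpha=0.005):
    """Sketch of a white-box, logit-based attack on a binary video classifier.

    Only pixel (spatial) content is perturbed; frame order and count are
    untouched, so temporal structure is preserved.

    video: float tensor of shape (1, C, T, H, W), values in [0, 1]
    true_label: 0 or 1 (e.g., 1 = "violent")
    """
    adv = video.clone().detach()
    target = 1 - true_label  # the class we want the model to predict instead
    for _ in range(steps):
        adv.requires_grad_(True)
        logits = model(adv)  # assumed shape (1, 2)
        # Logit loss: push the wrong-class logit above the true-class logit.
        loss = logits[0, target] - logits[0, true_label]
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()               # per-frame spatial step
            adv = video + (adv - video).clamp(-eps, eps)  # keep within L_inf budget
            adv = adv.clamp(0.0, 1.0)                     # stay in valid pixel range
        adv = adv.detach()
    return adv
```

Because the perturbation is applied to pixel values within each frame under an L-infinity budget, the resulting clip remains visually close to the original while the classifier's logits are driven toward the opposite class.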