Adversarial and focused training of abnormal videos for weakly-supervised anomaly detection

Ping He, Fan Zhang, Gang Li, Huibin Li

doi:10.1016/j.patcog.2023.110119

Abstract

Because abnormal events are sparse and scarce, intra-video and inter-video data imbalance are fundamental problems for the weakly supervised video anomaly detection (WS-VAD) task. Many previous works have made great progress on the intra-video data imbalance problem but have paid little attention to the inter-video case. We find that the performance of some existing state-of-the-art WS-VAD methods degrades when the number of abnormal videos used for training is reduced. To alleviate this problem, we propose a novel solution based on adversarial and focused training (AFT) of abnormal videos. Specifically, our solution consists of two modules. One is a data-based adversarial training (AT) module that performs data augmentation by generating adversarial samples of abnormal videos in latent space; the other is a model-based focused training (FT) module that focuses on a cost-sensitive loss for abnormal videos. Once the whole pipeline has been trained, a score-level late fusion strategy combines the anomaly scores of the adversarial training and focused training modules in the testing phase. The effectiveness of the proposed approach is demonstrated on the UCF-Crime, ShanghaiTech, XD-Violence, and UCSD Peds datasets in both the inter-video data imbalanced experimental setting and the original experimental setting. The source code is available at: https://github.com/Destind/AFT_codes.
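To make two of the ideas in the abstract concrete, the following is a minimal sketch of a cost-sensitive loss that up-weights abnormal videos and of score-level late fusion. The abstract specifies neither the exact loss nor the fusion rule, so everything here is an assumption for illustration: the function names `cost_sensitive_bce` and `late_fusion`, the weighted binary cross-entropy form, the `abnormal_weight` parameter, and the convex combination with weight `alpha` are hypothetical and are not taken from the authors' code.

```python
import math
from typing import List, Sequence


def cost_sensitive_bce(
    scores: Sequence[float],
    labels: Sequence[int],
    abnormal_weight: float = 2.0,
) -> float:
    """Weighted binary cross-entropy over video-level anomaly scores.

    Assumption: abnormal videos (label 1) have their loss term multiplied
    by `abnormal_weight`, so errors on the scarce class cost more.
    """
    eps = 1e-7  # keeps log() away from 0
    total = 0.0
    for score, label in zip(scores, labels):
        s = min(max(score, eps), 1.0 - eps)
        if label == 1:
            total += -abnormal_weight * math.log(s)
        else:
            total += -math.log(1.0 - s)
    return total / len(scores)


def late_fusion(
    at_scores: Sequence[float],
    ft_scores: Sequence[float],
    alpha: float = 0.5,
) -> List[float]:
    """Fuse per-snippet scores from two models at test time.

    Assumption: a simple convex combination; `alpha` weights the
    adversarially trained model and (1 - alpha) the focused one.
    """
    return [alpha * a + (1.0 - alpha) * f for a, f in zip(at_scores, ft_scores)]


if __name__ == "__main__":
    # Hypothetical scores for three snippets of one test video.
    fused = late_fusion([0.2, 0.8, 0.6], [0.4, 0.6, 0.9])
    print(fused)
    # The same score costs more when the video is labelled abnormal.
    print(cost_sensitive_bce([0.5], [1]), cost_sensitive_bce([0.5], [0]))
```

With `abnormal_weight` greater than 1, a misclassified abnormal video contributes more to the loss than a misclassified normal one, which is one common way to compensate for having few abnormal training videos.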

Full Text