Abstract

Video anomaly detection (VAD) refers to identifying unexpected events in videos. Deep generative model (DGM)-based methods learn regular patterns from normal videos and expect the learned model to yield larger generative errors for abnormal frames. However, DGMs cannot always do so, since they tend to capture patterns shared between normal and abnormal events, which results in similar generative errors for both. In this article, we propose a novel self-supervised framework for unsupervised VAD to tackle this problem. To this end, we design a self-supervised attentive generative adversarial network (SSAGAN), composed of a self-attentive predictor, a vanilla discriminator, and a self-supervised discriminator. On the one hand, the self-attentive predictor captures long-term dependencies to improve the prediction quality of normal frames. On the other hand, the predicted frames are fed to the vanilla discriminator and the self-supervised discriminator for true-false discrimination and self-supervised rotation detection, respectively. Essentially, the role of the self-supervised task is to force the predictor, via adversarial training, to encode semantic information into the predicted normal frames, so that the rotation angles of rotated normal frames can be detected. As a result, our self-supervised framework limits the model's ability to generalize to abnormal frames, yielding larger detection errors for them. Extensive experimental results indicate that SSAGAN outperforms other state-of-the-art methods, demonstrating its validity and superiority.
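To make the training scheme concrete, the following is a minimal PyTorch sketch of one SSAGAN-style training step under our own assumptions: the module names (predictor, d_vanilla, d_rot), the four-way 90-degree rotation pretext task, and the loss weights lambda_adv and lambda_rot are illustrative placeholders, not the authors' implementation.

# Minimal sketch of one adversarial training step with a rotation
# self-supervision head, as described in the abstract. All names and
# hyperparameters here are assumptions for illustration only.
import torch
import torch.nn.functional as F

def rotate(frames, k):
    """Rotation pretext transform: rotate frames by k * 90 degrees."""
    return torch.rot90(frames, k, dims=(2, 3))

def ssagan_step(predictor, d_vanilla, d_rot, opt_g, opt_d,
                clip, target, lambda_adv=0.05, lambda_rot=0.1):
    """One training step.

    clip   : (B, C*T, H, W) stacked past frames fed to the predictor
    target : (B, C, H, W)   ground-truth next frame
    d_rot is assumed to output 4-class logits (one per rotation angle).
    """
    batch = target.size(0)
    pred = predictor(clip)  # predicted next frame

    # --- Discriminator update: true/false discrimination on real vs.
    # predicted frames, plus rotation detection on real frames. ---
    opt_d.zero_grad()
    real_logit = d_vanilla(target)
    fake_logit = d_vanilla(pred.detach())
    d_adv = (F.binary_cross_entropy_with_logits(
                 real_logit, torch.ones_like(real_logit))
             + F.binary_cross_entropy_with_logits(
                 fake_logit, torch.zeros_like(fake_logit)))
    k = int(torch.randint(0, 4, (1,)))
    rot_labels = torch.full((batch,), k, dtype=torch.long,
                            device=target.device)
    d_rot_loss = F.cross_entropy(d_rot(rotate(target, k)), rot_labels)
    (d_adv + d_rot_loss).backward()
    opt_d.step()

    # --- Predictor update: prediction error plus fooling both heads.
    # The rotation term pushes the predictor to encode enough semantics
    # into predicted frames that their rotation angles remain detectable. ---
    opt_g.zero_grad()
    fake_logit_g = d_vanilla(pred)
    g_adv = F.binary_cross_entropy_with_logits(
        fake_logit_g, torch.ones_like(fake_logit_g))
    g_rot = F.cross_entropy(d_rot(rotate(pred, k)), rot_labels)
    g_loss = F.mse_loss(pred, target) + lambda_adv * g_adv + lambda_rot * g_rot
    g_loss.backward()
    opt_g.step()
    return g_loss.item()

At test time, the anomaly score for a frame would be its prediction error (e.g., MSE or PSNR against the ground truth), with abnormal frames expected to yield larger errors under this framework.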
