Abstract

Anomaly detection in surveillance videos aims to identify frames in which abnormal events occur. Existing approaches assume that the training and testing videos come from the same scene, and thus generalize poorly to unseen scenes. In this paper, we propose a Variational Anomaly Detection Network (VADNet), which is characterized by its strong scene adaptability: it can identify abnormal events in a new scene by referring to only a few normal samples, without fine-tuning. Our model embodies two major innovations. First, a novel Variational Normal Inference (VNI) module formulates image reconstruction in a conditional variational auto-encoder (CVAE) framework, learning a probabilistic decision model instead of a traditional deterministic one. Second, a Margin Learning Embedding (MLE) module boosts the variational inference and aids in distinguishing normal events. We theoretically demonstrate that minimizing the triplet loss in the MLE module facilitates maximizing the evidence lower bound (ELBO) of the CVAE, which promotes the convergence of VNI. By combining variational inference with margin learning, VADNet becomes considerably more generative and is able to handle the uncertainty caused by a changed scene and limited reference data. Extensive experiments on several datasets demonstrate that VADNet adapts to a new scene effectively without fine-tuning, significantly outperforms other methods, and establishes a new state of the art for few-shot scene-adaptive anomaly detection. We believe our method is closer to real-world application due to its strong generalization ability. All code is released at https://github.com/huangxx156/VADNet.
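The abstract names the two objective terms (the CVAE evidence lower bound and a triplet loss) without giving their form. The sketch below is a minimal PyTorch illustration of how such a combined objective could look, assuming a standard Gaussian CVAE and a standard triplet margin loss; the function names, the weight `lam`, and the margin value are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: negative ELBO (reconstruction + KL) plus a triplet margin
# term, mirroring the abstract's claim that minimizing the triplet loss
# also helps maximize the ELBO. Not the authors' released code.
import torch
import torch.nn.functional as F

def neg_elbo(x, x_recon, mu, logvar):
    # Negative ELBO of a Gaussian CVAE with a standard-normal prior:
    # reconstruction error + KL(q(z|x,c) || N(0, I))
    recon = F.mse_loss(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

def combined_loss(x, x_recon, mu, logvar,
                  anchor, positive, negative,
                  margin=1.0, lam=0.1):
    # Total training objective: minimize negative ELBO plus a weighted
    # triplet margin loss over embeddings of normal/abnormal samples.
    triplet = F.triplet_margin_loss(anchor, positive, negative, margin=margin)
    return neg_elbo(x, x_recon, mu, logvar) + lam * triplet
```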
