Abstract

Sustainable cities requires high-quality community management and surveillance analytics, which are supported by video anomaly detection techniques. However, mainstream video anomaly detection techniques still require manually labeled data and do not apply to real-world massive videos. Without labeling, unsupervised video anomaly detection (UVAD) is challenged by the problem of pseudo-labeled noise and the openness of anomaly detection. In response, a diffusion-based latent pattern learning UVAD framework is proposed, called DiffVAD. The method learns potential patterns by generating different patterns of the same event through diffusion models. The detection of anomalies is realized by evaluating the pattern distribution. The different patterns of normal events are diverse but correlated, while the different patterns of abnormal events are more diffuse. This manner of detection is equally effective for unseen normal events in the training set. In addition, we design a refinement strategy for pseudo-labels to mitigate the effects of the noise problem. Extensive experiments on six benchmark datasets demonstrate the design’s promising generalization ability and high efficiency. Specifically, DiffVAD obtains an AUC score of 81.9% on the ShanghaiTech dataset.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.