Abstract

The continuous development of urban and industrial environments has increased the demand for intelligent video surveillance. Deep learning has achieved remarkable performance for anomaly detection in surveillance videos. Previous approaches perform anomaly detection with a single pretext task (image reconstruction or prediction) and flag anomalies by a large reconstruction error or a poor prediction. However, they cannot fully exploit discriminative semantics and temporal context information. Moreover, tackling anomaly detection with a single pretext task is suboptimal because of the misalignment between the pretext task and anomaly detection. In this article, we propose a temporal-aware contrastive network (TAC-Net) to address these problems of anomaly detection for intelligent video surveillance. TAC-Net is an unsupervised method that uses deep contrastive self-supervised learning to capture high-level semantic features and tackles anomaly detection with multiple self-supervised tasks. During the inference phase, the multiple task losses and the contrastive similarity are used to calculate the anomaly score. Experimental results show that our method is superior to state-of-the-art approaches on three benchmarks, which demonstrates the effectiveness of TAC-Net.
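The abstract does not give the formula that combines the task losses and the contrastive similarity, so the following is only a minimal sketch of one plausible fusion rule, not the scoring used by TAC-Net. It assumes per-frame values are available, min-max normalizes each term over the video, converts similarity to dissimilarity, and takes a weighted average. The function names, the task names `task_a` and `task_b`, and the equal default weights are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def anomaly_scores(
    task_losses: Dict[str, Sequence[float]],
    similarity: Sequence[float],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Fuse per-frame task losses and contrastive similarity into one score.

    Assumption (not from the paper): higher loss and lower similarity both
    indicate an anomaly, and the terms are combined by a weighted average
    after min-max normalization.
    """
    n = len(similarity)
    for name, vals in task_losses.items():
        if len(vals) != n:
            raise ValueError(f"length mismatch for task '{name}'")

    # Normalize each loss term so that no single task dominates the sum.
    terms = {name: min_max(vals) for name, vals in task_losses.items()}
    # Low similarity to normal patterns is treated as evidence of anomaly.
    terms["contrastive"] = min_max([1.0 - s for s in similarity])

    if weights is None:
        weights = {name: 1.0 for name in terms}  # hypothetical equal weights
    total = sum(weights[name] for name in terms)

    return [
        sum(weights[name] * terms[name][i] for name in terms) / total
        for i in range(n)
    ]


# Toy values for three frames; the last frame has high losses and low similarity.
losses = {"task_a": [0.1, 0.2, 0.9], "task_b": [1.0, 1.5, 3.0]}
sim = [0.9, 0.8, 0.1]
scores = anomaly_scores(losses, sim)
```

Under these assumptions, the frame with the largest losses and the smallest similarity receives the highest score. The weights would need to be tuned, or replaced by the paper's own rule, for real use.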
