Abstract

Video anomaly detection (VAD) refers to identifying abnormal events in surveillance video. Typically, reconstruction-based VAD techniques employ convolutional autoencoders with a limited number of layers, which extract insufficient features and lead to improper network training. To address this challenge, we propose an end-to-end unsupervised feature enhancement network, the Bi-Residual Convolutional AutoEncoder (Bi-ResCAE), which learns normal events with low reconstruction error and detects anomalies with high reconstruction error. The Bi-ResCAE network incorporates long–short residual connections to enhance feature reusability and stabilize training. In addition, we formulate a novel VAD model that extracts appearance and motion features by fusing the Bi-ResCAE network and an optical flow network in the objective function to recognize anomalous objects in video. Extensive experiments on three benchmark datasets validate the effectiveness of the model. The proposed model achieves an AUC (area under the ROC curve) of 84.7% on Ped1, 97.7% on Ped2, and 86.71% on the Avenue dataset. The results show that Bi-READ outperforms state-of-the-art techniques.
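The scoring principle the abstract describes — train a reconstruction model on normal data only, then flag inputs with high reconstruction error — can be illustrated with a minimal stand-in. The sketch below uses a linear autoencoder (PCA via SVD) instead of the paper's Bi-ResCAE network, so all names and the synthetic data are assumptions for illustration only, not the authors' method:

```python
import numpy as np

# Illustrative sketch only: a linear "autoencoder" (PCA) fitted on normal
# data. This is NOT the paper's Bi-ResCAE; it only demonstrates the
# reconstruction-error anomaly-scoring principle from the abstract.
rng = np.random.default_rng(0)

# Synthetic normal data lies on a 2-D subspace of a 10-D space.
basis = rng.normal(size=(2, 10))
normal_train = rng.normal(size=(500, 2)) @ basis

# "Train" the autoencoder: top-2 principal components of normal data.
mean = normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(normal_train - mean, full_matrices=False)
components = vt[:2]  # tied linear encoder/decoder weights

def reconstruction_error(x):
    """Per-sample squared error after encode -> decode."""
    z = (x - mean) @ components.T      # encode to latent space
    x_hat = z @ components + mean      # decode back to input space
    return np.sum((x - x_hat) ** 2, axis=1)

# Normal test samples reconstruct well; off-subspace anomalies do not.
normal_test = rng.normal(size=(100, 2)) @ basis
anomalous = rng.normal(size=(100, 10)) * 2.0
err_normal = reconstruction_error(normal_test)
err_anomalous = reconstruction_error(anomalous)
print(err_normal.mean() < err_anomalous.mean())  # anomalies score higher
```

In the paper's setting, the deep Bi-ResCAE plays the role of the encoder/decoder and the per-frame reconstruction error serves as the anomaly score, with the optical flow branch contributing motion cues to the objective.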
