Abstract

Anomaly detection in video has recently gained attention due to its importance in intelligent surveillance systems. Even though the performance of state-of-the-art methods is competitive on benchmark datasets, the trade-off between computational resources and detection accuracy should be considered. In this paper, we present a framework to detect anomalies in video. We propose a “multi-scale U-Net” network architecture for unsupervised video anomaly detection based on the generative adversarial network (GAN) structure. Shortcut Inception Modules (SIMs) and residual skip connections are employed in the generator network to improve the training and testing of the neural network. Asymmetric convolutions are applied instead of traditional convolution layers to decrease the number of training parameters without a performance penalty in terms of detection accuracy. In the training phase, the generator network is trained to generate normal events, making the generated image as similar as possible to the ground truth. The multi-scale U-Net preserves useful image features that would otherwise be lost during training due to the convolution operator. The generator network is trained by minimizing the reconstruction error on normal data; in the testing phase, the reconstruction error is then used as an indicator of anomalies. Our proposed framework has been evaluated on three benchmark datasets: UCSD Pedestrian, CUHK Avenue, and ShanghaiTech. The proposed framework surpasses state-of-the-art learning-based methods on all three datasets, achieving 95.7%, 86.9%, and 73.0% AUC, respectively. Moreover, the numbers of training and testing parameters in our framework are reduced compared with the baseline network architecture, while the detection accuracy is still improved.
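The parameter savings from replacing a traditional k×k convolution with a pair of 1×k and k×1 asymmetric convolutions can be illustrated with a quick count. This is a minimal sketch assuming biases are ignored and the intermediate layer keeps the output channel count; it does not reproduce the exact layer configuration of the paper's generator:

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Number of weights in a conv layer with a k_h x k_w kernel (biases ignored)."""
    return k_h * k_w * c_in * c_out

def square_conv(k, c_in, c_out):
    # Traditional k x k convolution.
    return conv_params(k, k, c_in, c_out)

def asymmetric_conv(k, c_in, c_out):
    # 1 x k followed by k x 1, keeping c_out channels in between.
    return conv_params(1, k, c_in, c_out) + conv_params(k, 1, c_out, c_out)

# Example: a 3x3 convolution with 64 input and 64 output channels.
square = square_conv(3, 64, 64)         # 3*3*64*64 = 36864 weights
asym = asymmetric_conv(3, 64, 64)       # (1*3 + 3*1)*64*64 = 24576 weights
print(square, asym, 1 - asym / square)  # ~33% fewer weights for k = 3
```

For equal input and output channel counts, the asymmetric pair needs 2k instead of k² kernel weights per channel pair, so the savings grow with the kernel size.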

Highlights

  • Videos from Closed-Circuit Television (CCTV) cameras are generated every minute, as the number of cameras in public places keeps increasing to improve efficiency, safety, and security against criminal and terrorist attacks

  • Deep learning, both supervised and unsupervised, has been proposed for anomaly detection [9], [10], [12]–[19]; deep-learning-based methods have improved accuracy as well as reduced the false alarm rate

  • We present a framework that consists of a multi-scale generator network and residual skip connections that enable the network to learn higher-level image features


INTRODUCTION

Videos from Closed-Circuit Television (CCTV) cameras are generated every minute, as the number of cameras in public places keeps increasing to improve efficiency, safety, and security against criminal and terrorist attacks. The challenge in detecting an anomalous event is to distinguish the pattern of object movement, i.e., normal versus anomalous, since the scenes captured by surveillance cameras may involve movement over time. Several anomaly detection approaches based on either a convolutional autoencoder (ConvAE) [9]–[11] or a U-Net [12] detect anomalies in different ways. These approaches learn normal patterns from training videos and detect abnormal events that do not correspond to the learned model [10], [13]–[15]. We propose a framework for video anomaly detection using the GAN structure. The proposed multi-scale U-Net reduces the number of training and testing parameters, while the anomaly detection accuracy is still significantly improved.
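As a sketch of how reconstruction error can serve as an anomaly indicator at test time: per-frame error between the generated frame and the ground truth is computed, then normalized into a regularity score so that poorly reconstructed frames stand out. The min-max normalization below follows common practice in reconstruction-based methods and is an assumption, not the paper's exact scoring formula:

```python
def mse(frame_a, frame_b):
    """Mean squared error between two equally sized grayscale frames (lists of rows)."""
    total, count = 0.0, 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count

def regularity_scores(errors):
    """Min-max normalize reconstruction errors; a higher score means more normal."""
    lo, hi = min(errors), max(errors)
    span = (hi - lo) or 1.0
    return [1.0 - (e - lo) / span for e in errors]

# Toy example: the second frame reconstructs poorly, so it gets the lowest score.
gt = [[0.0, 0.0], [0.0, 0.0]]
recons = [
    [[0.0, 0.1], [0.0, 0.0]],  # near-perfect reconstruction
    [[0.9, 0.8], [0.7, 0.9]],  # poor reconstruction (anomaly candidate)
    [[0.1, 0.0], [0.1, 0.0]],  # near-perfect reconstruction
]
errors = [mse(gt, r) for r in recons]
scores = regularity_scores(errors)
# A frame is flagged anomalous when its score falls below a chosen threshold.
```

In practice the threshold is tuned on held-out data, and the frame-level scores are what the AUC figures quoted in the abstract are computed over.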

RELATED WORKS
RESIDUAL SKIP CONNECTION
OBJECTIVE FUNCTIONS
CONCLUSION
