Abstract

Anomaly detection in videos is challenging because positive (abnormal) samples are scarce and highly variable. Current anomaly detection methods can be categorized into reconstruction-based models and future frame prediction-based models. However, reconstruction models may adapt too well to abnormal events because of the learning capacity and generalization ability of deep neural networks, whereas prediction-based methods can be sensitive to noise. In this study, we propose an anomaly detection model based on the latent feature space that combines the advantages of both approaches. We argue that constraints in the latent feature space can promote reconstruction; in addition, optical flow is used to introduce temporal information. We use SPyNet for accurate and efficient optical flow estimation. We extensively validate our method on the UCSD Ped1, UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets. The results demonstrate the feasibility of the proposed method and the benefit of utilizing information in the latent feature space.

Highlights

  • Video anomaly detection is an important research field in computer vision

  • Inspired by [11] and [13], this paper proposes an anomaly detection model based on latent feature constraints

  • This article combines the prediction module and the reconstruction module in a generative adversarial network training framework and imposes latent feature constraints and SPyNet constraints in the reconstruction module. Minimizing the reconstruction error of both the image and the latent feature vectors of the reconstructed frame helps the model learn the distribution of normal samples and produce better reconstructions


Summary

INTRODUCTION

Video anomaly detection is an important research field in computer vision. Typically, samples with normal behavior make up the majority of the dataset, whereas only a limited number of abnormal samples are available. Anomaly detection based on predicting future frames attempts to remedy the defects of the reconstruction method. It defines an unexpected event as an abnormal event, takes consecutive frames as input, and, through a set of constraints, forces the predicted future frames to be consistent with the ground truth. For test videos with considerable noise, however, an accurate reconstruction error value cannot be obtained, which causes normal frames to be misjudged; this makes the approach ill-suited to anomaly detection in more complex monitoring scenarios. This article combines the prediction module and the reconstruction module in a generative adversarial network training framework and imposes latent feature constraints and SPyNet constraints in the reconstruction module. Minimizing the reconstruction error of both the image and the latent feature vectors of the reconstructed frame helps the model learn the distribution of normal samples and produce better reconstructions. When abnormal behavior occurs, such as someone riding a bicycle, the prediction is blurred.
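The idea of penalizing error in both the image and the latent feature vectors can be illustrated with a minimal sketch. This summary does not give the exact loss, so everything below is an assumption made for illustration: the use of mean squared error, the `latent_weight` parameter, the function names, and the treatment of frames and latent features as flattened vectors. The adversarial, prediction, and optical flow (SPyNet) terms are omitted entirely.

```python
from typing import Sequence


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Average squared difference between two equal-length vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_error(
    frame: Sequence[float],
    recon_frame: Sequence[float],
    latent: Sequence[float],
    recon_latent: Sequence[float],
    latent_weight: float = 1.0,  # hypothetical weighting, not taken from the paper
) -> float:
    """Image reconstruction error plus a weighted latent feature error.

    All inputs are flattened vectors; `latent` and `recon_latent` stand for the
    encoder features of the input frame and of the reconstructed frame.
    """
    image_term = mean_squared_error(frame, recon_frame)
    latent_term = mean_squared_error(latent, recon_latent)
    return image_term + latent_weight * latent_term
```

Under these assumptions, a frame that is reconstructed well in pixel space but whose latent features drift from those of the input still receives a non-zero error, which is the intuition behind adding a latent constraint on top of a purely image-based one.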

