Abstract

Given normal training samples, anomaly detection in videos can be regarded as the challenging problem of identifying unexpected events. State-of-the-art approaches generally rely on an autoencoder with a single encoder that captures motion and content patterns jointly. However, because accurate labels for normal and abnormal samples are lacking, how anomalies are detected is decided by each model's subjective understanding. This implies that different models tend to mine different patterns according to their own characteristics. We call this the pattern bias problem. To alleviate it, a novel Multi-Encoder Single-Decoder network, termed MESDnet, is proposed, which encodes motion and content cues separately with multiple encoders. MESDnet is trained end to end and runs in real time. Specifically, the differences between adjacent frames and the raw frames serve as the motion and content sources, respectively. A single decoder then detects anomalies by observing the reconstruction error on the video frames, using the multi-stream encoded motion and content features simultaneously. Experiments on the CUHK Avenue dataset, the UCSD Pedestrian dataset, and the ShanghaiTech Campus dataset verify the effectiveness of MESDnet.
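The two input sources and the error signal described above can be illustrated with a minimal sketch. It is not the MESDnet implementation: the encoders and the decoder are omitted, frames are assumed to be grayscale nested lists, mean squared error is an assumed choice of reconstruction error, and the names `frame_difference`, `build_sources`, and `reconstruction_error` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a grayscale frame is a list of rows of pixel intensities.
Frame = List[List[float]]


def frame_difference(prev: Frame, curr: Frame) -> Frame:
    """Motion source: pixel-wise difference between two adjacent frames."""
    return [
        [c - p for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def build_sources(frames: List[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Return (motion, content) inputs for a clip.

    motion[i] is frames[i + 1] - frames[i]; content is the raw frames.
    """
    motion = [frame_difference(a, b) for a, b in zip(frames, frames[1:])]
    content = frames
    return motion, content


def reconstruction_error(target: Frame, recon: Frame) -> float:
    """Mean squared error between a frame and its reconstruction.

    Assumption: MSE stands in for whatever error measure the paper uses.
    A larger value suggests the frame is less like the normal training data.
    """
    total = 0.0
    count = 0
    for target_row, recon_row in zip(target, recon):
        for t, r in zip(target_row, recon_row):
            total += (t - r) ** 2
            count += 1
    return total / count
```

In a full pipeline, the motion and content sources would each pass through their own encoder, and the decoder output would be compared with the original frame using a measure such as `reconstruction_error`.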
