Abstract

As one of the most important tasks in intelligent video analysis, video abnormal event detection has been extensively studied. Prior arts have made great progress in designing frameworks to capture spatio-temporal features of video frames. However, video frames usually contain various objects, and it is challenging to grasp the nuances of anomalies against noisy backgrounds. To tackle this bottleneck, we propose a novel Foreground–Background Separation Mutual Generative Adversarial Network (FSM-GAN) framework. FSM-GAN separates video frames into foreground and background. The separated foreground and background serve as the input of mutual generative adversarial networks, which transform raw-pixel images into optical-flow representations and vice versa. In these networks, the background is treated as a known condition, and the model focuses on learning high-level spatio-temporal foreground features that represent the event under the given conditions during mutual adversarial training. In the test stage, these high-level features, rather than low-level visual primitives, are used to measure abnormality at the semantic level. Comparisons with state-of-the-art abnormal event detection approaches demonstrate the effectiveness and reliability of the proposed framework across various scenes and events.
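The abstract does not specify how frames are split into foreground and background, so the sketch below is only a hypothetical illustration of that first step, assuming a simple temporal-median background model: the per-pixel median over a clip estimates the static background, and pixels that deviate strongly from it are marked as foreground. The function name `separate_foreground` and the threshold are our own placeholders, not the paper's method.

```python
import numpy as np

def separate_foreground(frames, thresh=25):
    """Split a clip into foreground masks and a background estimate.

    Hypothetical illustration only: the abstract does not describe the
    separation method, so we assume a temporal-median background model.
    frames: (T, H, W) grayscale video clip.
    Returns (masks, background), where masks is boolean (T, H, W).
    """
    frames = np.asarray(frames, dtype=np.float32)
    background = np.median(frames, axis=0)   # static background estimate
    diff = np.abs(frames - background)       # per-pixel deviation from background
    masks = diff > thresh                    # foreground where deviation is large
    return masks, background

# Toy usage: a bright 3x3 square moving across an otherwise empty scene.
frames = np.zeros((8, 16, 16), dtype=np.uint8)
for t in range(8):
    frames[t, 5:8, t:t + 3] = 200            # square occupies each column briefly
masks, background = separate_foreground(frames)
```

Since each pixel is bright in only a few of the eight frames, the median background stays dark and the moving square is recovered as foreground; the foreground masks would then be the event-specific input to the mutual generators, with the background held as the known condition.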
