Abstract

Automatically discovering anomalous events and objects in surveillance videos plays an important role in real-world applications and has attracted considerable attention in the computer vision community. However, it remains a challenging problem. In this paper, a novel approach for automatic anomaly detection is proposed. Our approach is highly efficient and can therefore perform real-time detection. Furthermore, it handles multiscale detection and copes with both spatial and temporal anomalies. Specifically, local features capturing both appearance and motion characteristics of videos are extracted from spatiotemporal video volumes (STVs). To bridge the large semantic gap between low-level visual features and high-level events, we use middle-level visual attributes as the intermediary. This three-level framework is modeled as an extreme learning machine (ELM). We propose to use a spatiotemporal pyramid (STP) to capture the spatial and temporal continuity of an anomalous event, enabling our approach to cope with multiscale and complicated events. Furthermore, we propose a method to efficiently update the ELM, so our approach is self-adaptive to background changes, which often occur in real-world applications. Experiments on several datasets are carried out, and the superior performance of our approach compared with state-of-the-art approaches verifies its effectiveness.

Highlights

  • Surveillance systems are widely used in cities, and detecting anomalous events from such systems plays an important role in real-world applications

  • To address this challenging problem, in this paper we propose a novel automatic anomaly detection approach with extreme learning machine (ELM) [25] based visual attributes and a spatiotemporal pyramid (STP)

  • Any small spatiotemporal video volume (STV) can be segmented further into 8 smaller STVs. However, this greatly increases computational complexity, and we find that a two-level STP achieves satisfactory performance for anomaly detection
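
The recursive 8-way split described in the last highlight can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes each STV is halved along time, height, and width, and that the first pyramid level is the unsplit volume. The names `halve`, `split_into_octants`, and `build_pyramid`, and the example volume size, are hypothetical.

```python
from itertools import product
from typing import List, Tuple

Bounds = Tuple[int, int]                 # half-open index range [start, stop)
Volume = Tuple[Bounds, Bounds, Bounds]   # (time, height, width) ranges of one STV


def halve(bounds: Bounds) -> List[Bounds]:
    """Split one index range into two halves at its midpoint."""
    start, stop = bounds
    mid = (start + stop) // 2
    return [(start, mid), (mid, stop)]


def split_into_octants(volume: Volume) -> List[Volume]:
    """Split an STV into 8 smaller STVs (assumed: halve every axis)."""
    t, y, x = volume
    return list(product(halve(t), halve(y), halve(x)))


def build_pyramid(volume: Volume, levels: int) -> List[List[Volume]]:
    """Level 0 holds the whole STV; each further level splits every cell into 8."""
    pyramid = [[volume]]
    for _ in range(levels - 1):
        finer = [child for parent in pyramid[-1]
                 for child in split_into_octants(parent)]
        pyramid.append(finer)
    return pyramid


# Hypothetical STV of 16 frames and 32x32 pixels, two pyramid levels.
pyramid = build_pyramid(((0, 16), (0, 32), (0, 32)), levels=2)
cells_per_level = [len(level) for level in pyramid]
```

Under these assumptions the number of cells grows by a factor of 8 per level, which is consistent with the highlight's remark that deeper pyramids add considerable computational cost.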

Summary

Introduction

Densely sampled local spatiotemporal descriptors, which represent both motion and appearance characteristics, are utilized in [20, 21], and they possess some degree of robustness to unimportant variations in surveillance video. They construct models to capture the relationship between low-level visual features and high-level semantic events. To address this challenging problem, in this paper we propose a novel automatic anomaly detection approach with extreme learning machine (ELM) [25] based visual attributes and a spatiotemporal pyramid (STP). The former focuses on the relationship between low-level visual features and high-level events, and the latter captures the spatial and temporal continuity of an event. (i) A novel approach for automatic anomaly detection is proposed. It is based on densely sampled STVs. Visual attributes are utilized to bridge the semantic gap, and we use an ELM to model this three-level framework.
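
For readers unfamiliar with ELMs, the generic single-hidden-layer formulation is given below as background. This is the standard textbook form, not necessarily the variant used in this paper, and this summary does not specify how the three levels map onto the network. All symbols are introduced here for illustration: $N$ training samples $\mathbf{x}_i$, $L$ hidden nodes with randomly assigned and fixed weights $\mathbf{w}_j$ and biases $b_j$, activation $g$, hidden-layer output matrix $\mathbf{H}$, target matrix $\mathbf{T}$, and output weights $\boldsymbol{\beta}$.

```latex
f(\mathbf{x}) = \sum_{j=1}^{L} \boldsymbol{\beta}_j \, g\!\left(\mathbf{w}_j^{\top}\mathbf{x} + b_j\right),
\qquad
H_{ij} = g\!\left(\mathbf{w}_j^{\top}\mathbf{x}_i + b_j\right), \quad i = 1,\dots,N,\; j = 1,\dots,L,
\qquad
\hat{\boldsymbol{\beta}} = \mathbf{H}^{\dagger}\mathbf{T}
```

Here $\mathbf{H}^{\dagger}$ is the Moore-Penrose pseudoinverse of $\mathbf{H}$. Because only $\boldsymbol{\beta}$ is learned, and in closed form, training is fast, which fits the efficiency emphasized in the abstract.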

Related Work
Detection via ELM-Based Visual Attribute
Spatiotemporal Pyramid
Experiment
Findings
Conclusion