Abstract
Abnormal event detection in video, also known as outlier detection, is a task where machine learning can be highly effective. Given an unseen video, the objective of such methods is to decide its category, i.e. normal or abnormal. This paper exploits visual information from both normal and abnormal videos to train a deep multiple instance learning (MIL) classifier for video classification. Existing MIL classifiers presume that training videos contain only short-duration anomalous events; this assumption does not hold for all real-world anomalies, and multiple occurrences of anomalies within a single training video cannot be ruled out. This paper shows that injecting temporal information into feature extraction improves anomaly detection performance. To accomplish this, two spatio-temporal deep feature extractors are applied in parallel to the training videos, and the resulting streams are used to train a modified MIL-based classifier. Finally, fuzzy aggregation is employed to fuse the anomaly scores of the two streams. Additionally, two lightweight deep-learning classifiers are used to substantiate the model’s efficacy for classifying fire and accident events. To assess the reliability and performance of the proposed method, extensive experiments have been carried out on the UCF-Crime video dataset, which contains 13 anomaly categories. The dataset has been restructured into five broad categories based on the severity of actions to study the robustness of the proposed method. The paper provides empirical evidence that incorporating temporal features in the pipeline significantly improves anomaly detection accuracy. Moreover, the model detects long-duration anomalies in videos, which was not possible with existing methods. The proposed end-to-end multi-stream architecture performs abnormal event detection with accuracy as high as 84.48%, exceeding that of existing video anomaly detection methods, and the class-wise detection accuracy improves by 6%–14% across the broad categories.
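To make the score-fusion step more concrete, the sketch below shows one plausible way to combine per-segment anomaly scores produced by the two parallel streams. The abstract does not specify which fuzzy aggregation operator is used, so the function `fuzzy_fuse`, the weight `w_high`, and the stream names are hypothetical; this is a minimal illustration of a weighted max/min (fuzzy OR/AND compromise) fusion, not the authors' implementation.

```python
# Illustrative sketch only: fusing per-segment anomaly scores from two
# spatio-temporal feature streams. The aggregation operator is assumed,
# since the abstract does not specify it.
import numpy as np

def fuzzy_fuse(scores_a: np.ndarray, scores_b: np.ndarray,
               w_high: float = 0.7) -> np.ndarray:
    """Fuse two streams of per-segment anomaly scores in [0, 1].

    For each segment, the larger score is weighted by `w_high` and the
    smaller by (1 - w_high), a soft compromise between fuzzy max (OR)
    and fuzzy min (AND) aggregation.
    """
    stacked = np.stack([scores_a, scores_b])           # shape: (2, n_segments)
    hi, lo = stacked.max(axis=0), stacked.min(axis=0)  # per-segment max and min
    return w_high * hi + (1.0 - w_high) * lo

# Example: anomaly scores for 4 video segments from two hypothetical streams.
stream_a_scores = np.array([0.10, 0.85, 0.40, 0.05])
stream_b_scores = np.array([0.20, 0.60, 0.90, 0.10])
print(fuzzy_fuse(stream_a_scores, stream_b_scores))
```

Weighting the larger score more heavily biases the fused result toward flagging a segment as anomalous when either stream is confident, while still discounting spurious single-stream spikes.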