Abstract

Real-time detection of violence remains an unsolved research problem despite recent advances in artificial intelligence. Detecting violence as it happens can reduce the severity of the resulting injuries, which calls for an effective detection method. Previously proposed methods could not provide robust results because of several challenges: noise, motion estimation, lack of appropriate feature selection, lack of an effective classification approach, complex backgrounds, and variations in illumination. This research proposes an efficient method for violence detection that uses moment features to capture motion patterns in each frame and to restrict processing to a smaller region of interest. This reduces the probability that motion intensity is lost because of a same-colored object in the background, and thus reduces background complexity. The proposed method then uses optical flow to calculate angles and linear distances in each frame. If a frame is lost due to noise or illumination variation, the method uses a Kalman filter to process that frame by eliminating the noise. Finally, a random forest classifier decides whether violence is present from a single feature vector by generating a set of probabilities for each class. In extensive experiments the method achieved an accuracy of 99.12% at a frame rate of 35 fps, which is higher than previously reported results. The experimental results demonstrate the effectiveness of the proposed methodology.
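The angle and distance features and the handling of lost frames described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the optical flow step has already produced matched point positions in two consecutive frames, and it substitutes a simple one-dimensional Kalman filter with a random-walk state model for whatever filter design the paper uses. The names `flow_features`, `ScalarKalman`, `smooth_with_gaps` and the noise settings `q` and `r` are hypothetical.

```python
import math


def flow_features(prev_pts, next_pts):
    """Return (angle in degrees, linear distance) for each matched point pair."""
    feats = []
    for (x0, y0), (x1, y1) in zip(prev_pts, next_pts):
        dx, dy = x1 - x0, y1 - y0
        angle = math.degrees(math.atan2(dy, dx)) % 360.0  # direction of motion
        dist = math.hypot(dx, dy)                          # magnitude of motion
        feats.append((angle, dist))
    return feats


class ScalarKalman:
    """1-D Kalman filter with a random-walk state model (assumed noise values)."""

    def __init__(self, x0, p0=1.0, q=0.01, r=0.25):
        self.x = x0  # state estimate
        self.p = p0  # estimate variance
        self.q = q   # process noise variance (assumption)
        self.r = r   # measurement noise variance (assumption)

    def predict(self):
        # Random walk: the state is carried over, uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z):
        # Blend prediction and measurement using the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


def smooth_with_gaps(measurements):
    """Filter a per-frame series; None marks a lost frame filled by prediction."""
    first = next(m for m in measurements if m is not None)
    kf = ScalarKalman(first)
    out = []
    for z in measurements:
        est = kf.predict()
        if z is not None:
            est = kf.update(z)
        out.append(est)
    return out
```

For example, `smooth_with_gaps([2.0, None, 4.0])` carries the first estimate through the missing frame and then moves part of the way toward the next measurement. In a full pipeline the resulting per-frame features would be assembled into the single feature vector passed to the classifier.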

Highlights

  • Surveillance applications are used to monitor public and private areas, yet intelligent violence detection remains an unsolved research problem.

  • Research in [21] presented an end-to-end deep neural network for violence detection using surveillance cameras. The authors extracted a set of selectively distributed frames from the video and passed spatio-temporal features to a fully connected neural network to classify each action as violent or non-violent. The spatio-temporal features were extracted across both the space and time dimensions using a custom-built convolutional neural network and a long short-term memory (LSTM) recurrent neural network. Validation against computation time, or processing time per frame, was not addressed in their research.

  • For C3D and CNN-LSTM they achieved accuracy rates of 63% and 61%, respectively, at a frame rate of 25 fps. Although this accuracy was not promising, the two deep neural networks (DNNs) were robust at learning high-level spatio-temporal information from raw image data. They combined the feature maps obtained from the C3D and CNN-LSTM networks through a shallow neural network, which served as the third scenario in their research. Their overall approach requires further validation in terms of computational time.


Summary

Introduction

Surveillance applications are used to monitor public and private areas, yet intelligent violence detection remains an unsolved research problem. Research in [20] applied two deep neural networks (DNNs), a 3D convolutional neural network (C3D) and a CNN with long short-term memory (CNN-LSTM), to learn high-level spatio-temporal information from raw image data for violence detection, and combined the feature maps obtained from C3D and CNN-LSTM through a shallow neural network. Their proposed method was suitable for still images. Research in [23] proposed a real-time descriptor that models crowd dynamics by encoding variations in crowd texture using temporal summaries of grey level co-occurrence matrix (GLCM) features. The authors measured inter-frame uniformity and showed that violent behavior varies in a less uniform manner. Their method did not perform robustly for discontinuous and fast motion.
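To make the GLCM idea concrete, the following is a minimal sketch of a co-occurrence matrix and its uniformity (the angular second moment, also called energy), compared across consecutive frames. It is an illustration only and not the descriptor of [23]: the function names, the single horizontal pixel offset, and the use of the absolute frame-to-frame difference as a temporal summary are all assumptions. Frames are taken to be small 2-D lists of integer grey levels in the range 0 to `levels - 1`.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized grey-level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    # Convert pair counts to joint probabilities.
    return [[v / total for v in row] for row in counts]


def uniformity(p):
    """Angular second moment: sum of squared GLCM probabilities."""
    return sum(v * v for row in p for v in row)


def interframe_uniformity_change(frames, levels):
    """Absolute change in GLCM uniformity between consecutive frames."""
    u = [uniformity(glcm(f, levels)) for f in frames]
    return [abs(b - a) for a, b in zip(u, u[1:])]
```

A perfectly flat frame has uniformity 1.0, while a frame whose neighboring pixels alternate between grey levels spreads probability over several matrix entries and scores lower, so a large change between frames signals a shift in texture.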
