Abstract
AbstractViolence can be committed anywhere, even in crowded places. It is hence necessary to monitor human activities for public safety. Surveillance cameras can monitor surrounding activities but require human assistance to continuously monitor every incident. Automatic violence detection is needed for early warning and fast response. However, such automation is still challenging because of low video resolution and blind spots. This paper uses ResNet50v2 and the gated recurrent unit (GRU) algorithm to detect violence in the Movies, Hockey, and Crowd video datasets. Spatial features were extracted from each frame sequence of the video using a pretrained model from ResNet50V2, which was then classified using the optimal trained model on the GRU architecture. The experimental results were then compared with wavelet feature extraction methods and classification models, such as the convolutional neural network and long short‐term memory. The results show that the proposed combination of ResNet50V2 and GRU is robust and delivers the best performance in terms of accuracy, recall, precision, and F1‐score. The use of ResNet50V2 for feature extraction can improve model performance.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.