Abstract

In this study, we developed a deep learning-based fighting behavior recognition method for video surveillance systems and evaluated its effectiveness through several experiments. The proposed method is a two-step fighting behavior recognition framework. First, consecutive frames of the target surveillance video are fed to the Inflated 3D ConvNet (I3D) network, which performs well on behavior recognition, to extract spatiotemporal features. These extracted 3D features are then used as inputs to the second step, where a fight situation is detected by a classification model consisting of a fully connected layer. Using the proposed aggressive behavior detection framework effectively requires first training the fight detection model. However, it is not possible to collect sufficient fighting videos in various outdoor environments. To overcome this limitation, we generated a large amount of training data through data augmentation. In addition, rather than training the I3D network directly on the training videos, we use an I3D network pretrained on the Kinetics video dataset: action features are extracted from the input consecutive frames with the pretrained network and are subsequently used to train the fully connected classification model. We also propose a training method based on multiple instance learning that handles ambiguous fight boundaries, mitigating the uncertainty about where a fight starts and ends in the fight videos. The effectiveness of the proposed method was verified through several experiments by comparing the present results with those of previously reported studies.
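The second step and the multiple instance learning idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-dimensional feature vectors stand in for I3D features that a pretrained network would have produced, the function names are hypothetical, and the hinge-style ranking loss is one common multiple instance learning formulation rather than the loss the authors necessarily used. Each video is treated as a "bag" of segments with only a video-level label, so the loss compares the highest-scoring segment of a fight video against the highest-scoring segment of a normal video.

```python
import math

def score_segment(features, weights, bias):
    """One fully connected layer plus sigmoid: feature vector -> fight score in (0, 1)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def score_bag(bag, weights, bias):
    """Score every segment of a video (a 'bag' of per-segment feature vectors)."""
    return [score_segment(segment, weights, bias) for segment in bag]

def mil_ranking_loss(fight_scores, normal_scores, margin=1.0):
    """Hinge ranking loss on the top-scoring segment of each bag (assumed formulation).

    Only video-level labels are needed: the loss pushes the best segment of the
    fight video to score higher than the best segment of the normal video.
    """
    return max(0.0, margin - max(fight_scores) + max(normal_scores))

# Toy stand-ins for features from a pretrained extractor (hypothetical values).
weights = [1.0, -1.0]
bias = 0.0
fight_bag = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]]   # 3 segments of a fight video
normal_bag = [[0.0, 0.0], [0.0, 2.0]]              # 2 segments of a normal video

fight_scores = score_bag(fight_bag, weights, bias)
normal_scores = score_bag(normal_bag, weights, bias)
loss = mil_ranking_loss(fight_scores, normal_scores)

# The segment with the highest score is the most likely location of the fight.
likely_fight_segment = fight_scores.index(max(fight_scores))
print(loss, likely_fight_segment)
```

In this formulation the exact start and end frames of a fight never need to be annotated; the segment-level scores that fall out of training indicate where in the video the fight most likely occurs.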
