Abstract

Violent action classification in community-based surveillance is a particularly challenging concept in itself. The ambiguity of violence as a complex action can lead to the misclassification of violence-related crimes in detection models and the increased complexity of intelligent surveillance systems leading to greater costs in operations or cost of lives. This paper demonstrates a novel approach to performing automatic violence detection by considering violence as complex actions mitigating oversimplification or overgeneralization of detection models. The proposed work supports the notion that violence is a complex action and is classifiable through decomposition into more identifiable actions that could be easily recognized by human action recognition algorithms. A two-stage framework was designed to detect simple actions which are sub-concepts of violence in a two-stream action recognition architecture. Using a basic logistic regression layer, simple actions were further classified as complex actions for violence detection. Varying configurations of the work were tested, such as applying action silhouettes, varying activation caching sizes, and different pooling methods for post-classification smoothing. The framework was evaluated considering accuracy, recall, and operational speed considering its implications in community deployment. The experimental results show that the developed framework reaches 21 FPS operation speeds for real-time operations and 11 FPS for non-real-time operations. Using the proposed variable caching algorithm, median pooling results in accuracy reaching 83.08% and 80.50% for non-real-time and real-time operations. In comparison, applying max pooling results to recalls reached 89.55% and 84.93% for non-real-time and real-time operations, respectively. This paper shows that complex action decomposition is deemed to be an appropriate method through the comparable performance with existing efforts that have not considered violence as complex actions implying a new perspective for automatic violence detection in intelligent surveillance systems.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call