Abstract

Automated identification of human activities remains a complex endeavor, particularly in unique settings such as temple environments. This study employs machine learning and deep learning techniques to analyze human activities for intelligent temple surveillance. However, standardized datasets tailored to temple surveillance are scarce, creating a need for specialized data. In response, this research introduces a new dataset featuring eight distinct classes of human activities, predominantly centered on hand gestures and body postures. To identify the most effective solution for Human Activity Recognition (HAR), a comprehensive ablation study is conducted across a variety of conventional machine learning and deep learning models. By integrating YOLOv4's robust object detection with ConvLSTM's ability to model both spatial and temporal dependencies in spatio-temporal data, the approach can recognize and understand human activities in sequences of images or video frames. Notably, the proposed YOLOv4-ConvLSTM approach emerges as the optimal choice, achieving an accuracy of 93.68%. This outcome underscores the suitability of the outlined methodology for diverse HAR applications in temple environments.
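
For readers who wish to prototype the recognition stage described above, a minimal sketch is given below. It assumes a YOLOv4 detector has already been used to crop per-frame person regions (the detector itself is not shown) and uses Keras's ConvLSTM2D layer to model the resulting spatio-temporal sequence. The 16-frame clip length, 64x64 crop resolution, and all layer sizes are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal sketch of the ConvLSTM classification stage (assumptions noted above).
# Input: a sequence of person crops produced by a YOLOv4 detector (not shown).
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN, H, W, C = 16, 64, 64, 3   # illustrative: 16 cropped frames per clip
NUM_CLASSES = 8                    # the eight temple-activity classes

def build_convlstm_classifier():
    model = models.Sequential([
        layers.Input(shape=(SEQ_LEN, H, W, C)),
        # ConvLSTM2D learns spatial features and temporal dependencies jointly.
        layers.ConvLSTM2D(32, kernel_size=(3, 3), padding="same",
                          return_sequences=True, activation="tanh"),
        layers.BatchNormalization(),
        layers.ConvLSTM2D(16, kernel_size=(3, 3), padding="same",
                          return_sequences=False, activation="tanh"),
        # Pool the final spatial feature map and classify the activity.
        layers.GlobalAveragePooling2D(),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    build_convlstm_classifier().summary()
```

In a full pipeline, each training sample would be a tensor of shape (SEQ_LEN, H, W, C) built by stacking YOLOv4-cropped person regions from consecutive frames, with an integer label identifying one of the eight activity classes.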
