Abstract
Recognizing and understanding human activities in real-time video is challenging due to the complex nature of video data and the need for efficient, accurate analysis. This research introduces a robust framework for video activity recognition that leverages a stacked Bidirectional Long Short-Term Memory (Bi-LSTM) and Gated Recurrent Unit (GRU) architecture within a fusion-based deep model. The stacked Bi-LSTM-GRU model exploits its dual recurrent architecture to capture nuanced temporal dependencies within video sequences, while the fusion-based deep architecture combines spatial and temporal features, enabling the model to discern intricate patterns in human activities. To further enhance the model's discriminative power, we introduce a fusion module in the proposed deep architecture. The fusion module integrates multi-modal features extracted from different levels of the network hierarchy, yielding a more comprehensive representation of video activities. We demonstrate the efficacy of our approach through rigorous experimentation on the UCF50, UCF101, and HMDB51 datasets. On UCF50, the model achieves accuracies of 97.01% and 95.86% on the training and validation sets respectively, showcasing its proficiency in discerning activities across a diverse range of scenarios. On UCF101, the proposed approach achieves competitive accuracies of 97.62% and 96.93% on the training and validation sets, surpassing previous benchmarks by a margin of approximately 1%. Furthermore, on the challenging HMDB51 dataset, the model demonstrates robust accuracies of 89.71% and 88.88% on the training and validation sets, solidifying its efficacy in intricate action-recognition tasks.
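To make the fusion idea concrete, the sketch below illustrates one common way to integrate features from different levels of a network hierarchy: pooling a mid-level spatial feature map and concatenating it with a temporal feature vector from a recurrent branch. The function names, feature shapes, and pooling choice are illustrative assumptions for exposition, not details taken from the paper.

```python
import numpy as np

# Illustrative sketch (assumed design, not the paper's exact fusion module):
# multi-level features are pooled to fixed-size vectors and concatenated.

def global_average_pool(feature_map):
    """Pool a (H, W, C) spatial feature map down to a (C,) vector."""
    return feature_map.mean(axis=(0, 1))

def fuse_features(spatial_map, temporal_vec):
    """Concatenate pooled spatial features with a recurrent (temporal) feature vector."""
    return np.concatenate([global_average_pool(spatial_map), temporal_vec])

# Example: a 7x7x256 mid-level convolutional map fused with a 128-d Bi-LSTM-GRU state
spatial = np.random.rand(7, 7, 256)
temporal = np.random.rand(128)
fused = fuse_features(spatial, temporal)
print(fused.shape)  # (384,)
```

The fused vector would then feed a classifier head; concatenation is only one option, and weighted or attention-based fusion are common alternatives.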