Human Action Representation Learning Using an Attention-Driven Residual 3DCNN Network

Hayat Ullah, Arslan Munir

doi:10.3390/a16080369

Abstract

The recognition of human activities using vision-based techniques has become a crucial research field in video analytics. Over the last decade, there have been numerous advancements in deep learning algorithms aimed at accurately detecting complex human actions in video streams. While these algorithms have demonstrated impressive performance in activity recognition, they often exhibit a bias towards either model performance or computational efficiency. This biased trade-off between robustness and efficiency poses challenges when addressing complex human activity recognition problems. To address this issue, this paper presents a computationally efficient yet robust approach, exploiting saliency-aware spatial and temporal features for human action recognition in videos. To achieve effective representation of human actions, we propose an efficient approach called the dual-attentional Residual 3D Convolutional Neural Network (DA-R3DCNN). Our proposed method utilizes a unified channel-spatial attention mechanism, allowing it to efficiently extract significant human-centric features from video frames. By combining dual channel-spatial attention layers with residual 3D convolution layers, the network becomes more discerning in capturing spatial receptive fields containing objects within the feature maps. To assess the effectiveness and robustness of our proposed method, we have conducted extensive experiments on four well-established benchmark datasets for human action recognition. The quantitative results obtained validate the efficiency of our method, showcasing significant improvements in accuracy of up to 11% as compared to state-of-the-art human action recognition methods. Additionally, our evaluation of inference time reveals that the proposed method achieves up to a 74× improvement in frames per second (FPS) compared to existing approaches, thus showing the suitability and effectiveness of the proposed DA-R3DCNN for real-time human activity recognition.
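The abstract describes combining channel-spatial attention with residual connections over spatio-temporal feature maps. The NumPy sketch below illustrates that general idea only; it is not the authors' DA-R3DCNN. The function names, the reduction ratio `r`, the two-layer channel gate, and the weighted average/max spatial mask are all assumptions chosen for illustration, and the 3D convolution layers themselves are omitted for brevity.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x, w1, w2):
    """Reweight the channels of x, shaped (C, T, H, W), from pooled statistics."""
    pooled = x.mean(axis=(1, 2, 3))          # global average pool -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)    # bottleneck + ReLU -> (C // r,)
    weights = sigmoid(w2 @ hidden)           # per-channel gate in (0, 1) -> (C,)
    return x * weights[:, None, None, None]


def spatial_attention(x, mix):
    """Reweight every (t, h, w) position of x using channel-pooled maps."""
    avg_map = x.mean(axis=0)                 # (T, H, W)
    max_map = x.max(axis=0)                  # (T, H, W)
    # Assumption: a learned conv would normally mix these maps; here a fixed
    # two-element weight vector stands in for it.
    mask = sigmoid(mix[0] * avg_map + mix[1] * max_map)
    return x * mask[None]


def attention_residual_block(x, w1, w2, mix):
    """Apply channel then spatial attention and add the input back (skip path)."""
    attended = spatial_attention(channel_attention(x, w1, w2), mix)
    return x + attended


# Hypothetical sizes: 8 channels, 4 frames, 6x6 spatial grid, reduction ratio 2.
rng = np.random.default_rng(0)
C, T, H, W, r = 8, 4, 6, 6, 2
x = rng.standard_normal((C, T, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
mix = np.array([0.5, 0.5])

out = attention_residual_block(x, w1, w2, mix)
print(out.shape)
```

Because both gates lie strictly between 0 and 1, the skip connection guarantees that no input activation is suppressed entirely: attention can only emphasize salient channels and positions relative to the rest.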

Journal: Algorithms
Publication Date: Jul 31, 2023
Citations: 1
License type: CC BY 4.0

Similar Papers

Learning correlations for human action recognition in videos
Yun Yi ... Bowen Zhang
Multimedia Tools and Applications | VOL. 76
10 Feb 2017

Human action recognition in surveillance video of a computer laboratory
Abdul-Lateef Yussiff ... Yong Suet-Peng
01 Aug 2016

Human Action Recognition in Video via Fused Optical Flow and Moment Features – Towards a Hierarchical Approach to Complex Scenario Recognition
Kathy Clawson ... Jun Liu
01 Jan 2014

Actlets: A novel local representation for human action recognition in video
Muhammad Muneeb Ullah ... Ivan Laptev
01 Sep 2012
