DenseGCN: A multi‐level and multi‐temporal graph convolutional network for action recognition

Chengzhang Yu,Wenxia Bao

doi:10.1049/ipr2.12872

Abstract

With the exponential growth of video data, action recognition has become an increasingly important area of study. Despite various advancements, balancing recognition accuracy against model size remains a formidable challenge, primarily because existing action recognition models are complex. To address this issue, DenseGCN is developed: a lightweight network designed to deliver high accuracy while remaining small enough for real‐world applications. DenseGCN fuses features at three levels. In the first stage, the Multi‐level Fusion Network (MlFN) combines dense connections with a Spatial‐Temporal Fusion Attention module (STF‐Att) to eliminate the bias in feature extraction caused by deep networks. In the second stage, RefineBone tackles optimization issues in low‐dimensional feature layers by leveraging high‐dimensional feature layers, thus avoiding gradient stacking. Finally, the Multi‐temporal Fusion Feature Pyramid Network (MF‐FPN) generates a discriminative classification feature map by repeatedly combining data from multiple dimensions. This strategy refines the extracted features, allowing discriminative feature extraction even with a reduced number of channels. The efficient design contributes to further research on lightweight networks and makes real‐world deployment more practical. On two large‐scale datasets, NTU RGB+D 60 and NTU RGB+D 120, DenseGCN outperformed other state‐of‐the‐art methods, achieving an accuracy of 92.7% on the X‐View benchmark of NTU RGB+D 60. DenseGCN is 10.2× faster and 10× smaller than the spatial temporal graph attention network (STGAT) proposed in 2022 while retaining very competitive accuracy. The findings suggest that the model significantly improves the quality of feature extraction and, as a result, strikes a strong balance between accuracy and model size.
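
The abstract mentions dense connections over graph convolutions but gives no implementation details. The following is a minimal NumPy sketch of that general idea only, not the authors' MlFN: each layer receives the concatenation of all earlier outputs, so channel counts grow by a fixed amount per layer. The function names, the 5‐joint chain skeleton, the growth rate of 4, and the random weights are all assumptions made for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(x, a_norm, weight):
    """One graph convolution: aggregate over neighbours, project, apply ReLU."""
    return np.maximum(a_norm @ x @ weight, 0.0)


def dense_gcn_block(x, a_norm, weights):
    """Dense connectivity: every layer sees all earlier feature maps."""
    features = [x]
    for w in weights:
        layer_in = np.concatenate(features, axis=1)
        features.append(graph_conv(layer_in, a_norm, w))
    return np.concatenate(features, axis=1)


def make_weights(in_channels, growth, num_layers, rng):
    """Hypothetical random weights; layer i takes in_channels + i * growth inputs."""
    return [
        rng.standard_normal((in_channels + i * growth, growth)) * 0.1
        for i in range(num_layers)
    ]


# Toy skeleton (assumption): 5 joints connected in a chain.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))          # 5 joints, 3 input channels (x, y, z)
weights = make_weights(3, 4, 3, rng)     # 3 layers, 4 new channels per layer
a_norm = normalize_adjacency(adj)
out = dense_gcn_block(x, a_norm, weights)
print(out.shape)  # (5, 15): 3 input channels + 3 layers * 4 channels
```

Because each layer adds only a few channels and reuses everything computed before it, a block like this can stay narrow, which is one common motivation for dense connectivity in lightweight networks.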

Journal: IET Image Processing	Publication Date: Aug 1, 2023
Citations: 1	License type: CC BY-NC-ND 4.0

Similar Papers

MFNet‐LE: Multilevel fusion network with Laplacian embedding for face presentation attacks detection
Sijie Niu ... Tingwei Wang
IET Image Processing | VOL. 15
08 Jul 2021

Skeleton-Based Activity Recognition: Preprocessing and Approaches
Sujan Sarker ... Syeda Faiza Ahmed
01 Jan 2020

A Multiscale Dual-Branch Feature Fusion and Attention Network for Hyperspectral Images Classification
Hongmin Gao ... Zhonghao Chen
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing | VOL. 14
01 Jan 2020

A framework for mobile activity recognition
Jiahui Wen
22 May 2017
