Rethinking the ST-GCNs for 3D skeleton-based human action recognition

Wei Peng, Jingang Shi, Tuomas Varanka, Guoying Zhao

doi:10.1016/j.neucom.2021.05.004

Abstract

Skeletal data has become an alternative for the human action recognition task, as it provides more compact and distinct information than traditional RGB input. However, unlike RGB input, skeleton data lies in a non-Euclidean space, where traditional deep learning methods cannot be used to their fullest potential. Fortunately, with the emerging trend of geometric deep learning, the spatial-temporal graph convolutional network (ST-GCN) has been proposed to address action recognition from skeleton data. ST-GCN and its variants fit well with skeleton-based action recognition and are becoming the mainstream frameworks for this task. However, their efficiency and performance are hindered by either fixing the skeleton joint correlations or relying on a computationally expensive strategy to construct a dynamic topology for the skeleton. We argue that many of these operations are either unnecessary or even harmful for the task. By theoretically and experimentally analysing the state-of-the-art ST-GCNs, we provide a simple but efficient strategy to capture the global graph correlations and thus efficiently model the representation of the input graph sequences. Moreover, the global graph strategy also reduces the graph sequence into the Euclidean space, so a multi-scale temporal filter is introduced to efficiently capture the dynamic information. With this method, we are not only able to better extract the graph correlations with far fewer parameters (only 12.6% of the current best), but we also achieve superior performance. Extensive experiments on the current largest 3D datasets, NTU-RGB+D and NTU-RGB+D 120, demonstrate that our network is both efficient and lightweight on this task.

Highlights

Human action recognition is a valuable but challenging topic which has attracted substantial attention from different research areas in recent years, since it provides significant insights into many valuable fields like action surveillance, human behavior analysis, pedestrian tracking, and robotics.
Our method can be treated as a variant of Graph Convolutional Networks (GCNs), but we explore a better way to capture the graph information with a more compact model.
To make the paper self-contained, we first briefly review how to model a spatial graph with GCNs.

Summary

Introduction

Human action recognition is a valuable but challenging topic which has attracted substantial attention from different research areas in recent years, since it provides significant insights into many valuable fields like action surveillance, human behavior analysis, pedestrian tracking, and robotics. GCNs are far from efficient for small-scale graph tasks. These methods first find the neighbors of each node to build a local receptive field, and then a neural network is constructed on top of it. Our method provides a simple but efficient way to capture a global graph representation for small-scale graph data. It does not require a pre-defined topology matrix, which makes the method much more convenient, and it can be used as an alternative to current GCNs. Extensive experiments are conducted on the two current largest benchmarks. Comparison results show the superiority and efficiency of our method: its model size is only 12.6% of that of the current best method, NAS-GCN [15].
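
The contrast drawn above, between a layer tied to a pre-defined skeleton topology and one that captures global joint correlations without it, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy sizes, the five-joint chain in `edges`, and the function names `normalized_adjacency` and `graph_layer` are assumptions made for illustration, and the "global" matrix is only randomly initialised here, whereas in a real model it would be a trainable parameter.

```python
import numpy as np

def normalized_adjacency(edges, num_joints):
    """Fixed topology: symmetric normalisation D^-1/2 (A + I) D^-1/2."""
    a = np.eye(num_joints)                      # self-loops
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0                 # undirected bone between joints i and j
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))   # inverse square root of node degrees
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_layer(x, mixing, weight):
    """One spatial layer.

    x:      (T, V, C_in)  features for T frames and V joints
    mixing: (V, V)        joint-to-joint correlation matrix
    weight: (C_in, C_out) channel projection
    """
    mixed = np.einsum("uv,tvc->tuc", mixing, x)  # aggregate features across joints
    return np.maximum(mixed @ weight, 0.0)       # project channels, then ReLU

rng = np.random.default_rng(0)
T, V, C_in, C_out = 4, 5, 3, 8                   # toy sizes (assumption)
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]         # hypothetical chain skeleton

x = rng.standard_normal((T, V, C_in))
w = rng.standard_normal((C_in, C_out))

# (a) Pre-defined topology: only physically connected joints are mixed.
a_fixed = normalized_adjacency(edges, V)
out_fixed = graph_layer(x, a_fixed, w)

# (b) Global correlations: a dense V x V matrix with no topology prior.
#     Randomly initialised here; it would be learned in practice.
a_global = np.eye(V) + 0.1 * rng.standard_normal((V, V))
out_global = graph_layer(x, a_global, w)
```

In the fixed case, joints 0 and 4 of the chain have a zero entry in `a_fixed` and cannot exchange information within a single layer, while the dense matrix in (b) lets every joint pair interact directly. For a small graph such as a skeleton, a dense matrix of this kind stays cheap because V is small.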

Related work

Methodology

GCN preliminaries

Uniform formulation of the ST-GCNs

Experiments

Datasets and metrics

Experiment settings

Methods

Comparison with the state-of-the-art methods

Ablation experiments

Findings

Conclusions


Journal: Neurocomputing	Publication Date: May 6, 2021
Citations: 36	License type: cc-by


