Abstract

In skeleton-based human action recognition methods, human behaviours can be analysed through temporal and spatial changes in the human skeleton. Skeletons are not affected by clothing changes, lighting conditions, or complex backgrounds, so this recognition approach is robust and has attracted great interest. However, many existing studies use deep networks with large numbers of parameters to improve model performance, thereby losing the low computational cost that is the main advantage of skeleton data; such models are difficult to deploy in real-life applications based on low-cost embedded devices. To obtain a model with fewer parameters and higher accuracy, this study designed a lightweight frame-level joints adaptive graph convolutional network (FLAGCN) for skeleton-based action recognition. Compared with the classical 2s-AGCN model, the new model obtains higher precision with 1/8 of the parameters and 1/9 of the floating-point operations (FLOPs). The proposed network features three main improvements. First, an early feature-fusion method replaces the multistream network and reduces the number of required parameters. Second, at the spatial level, two kinds of graph convolution capture different aspects of human action information: a frame-level graph convolution constructs a human topological structure for each data frame, whereas an adjacency graph convolution captures the characteristics of adjacent joints. Third, the proposed model hierarchically extracts different levels of action-sequence features, making it clear and easy to understand while reducing its depth and parameter count. Extensive experiments on the NTU RGB+D 60 and 120 data sets show that this method offers few parameters, low computational costs, and fast speeds. It also has a simple structure and training process that make it easy to deploy in real-time recognition systems based on low-cost embedded devices.
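As a rough illustration of the two spatial aggregation schemes named in the abstract, the sketch below (NumPy; joint count, channel sizes, and weight shapes are hypothetical, and the bone list is truncated for brevity) contrasts a fixed adjacency-based graph convolution with a frame-level adaptive one whose adjacency is computed per frame from the joint features themselves. This is a minimal sketch, not the paper's actual layer definition.

```python
import numpy as np

rng = np.random.default_rng(0)
V, C = 25, 3                      # 25 joints (NTU skeleton), 3D coordinates
X = rng.standard_normal((V, C))   # joint features for one frame

# Adjacency graph convolution: aggregate each joint's physical neighbours
# through a fixed, normalised skeleton adjacency matrix.
A = np.eye(V)                     # self-loops; full bone list omitted
A[0, 1] = A[1, 0] = 1             # one example bone between joints 0 and 1
D_inv = np.diag(1.0 / A.sum(axis=1))
W = rng.standard_normal((C, 8))   # learnable weights (8 output channels, hypothetical)
out_fixed = D_inv @ A @ X @ W     # shape (V, 8)

# Frame-level adaptive graph convolution: build a data-dependent adjacency
# from pairwise joint similarity in this frame (softmax-normalised rows),
# so the topology can differ from frame to frame.
S = X @ X.T                                           # (V, V) similarity
A_adapt = np.exp(S) / np.exp(S).sum(axis=1, keepdims=True)
out_adapt = A_adapt @ X @ W                           # shape (V, 8)

print(out_fixed.shape, out_adapt.shape)
```

In a real network both branches would be learned end to end and applied to every frame of the sequence; here the weights are random placeholders.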

Highlights

  • Human action recognition can be used in various scenes, such as video retrieval and human-computer interaction [1], so it has been widely discussed in the literature

  • To solve the problems described above, this study proposed a lightweight hierarchical model called a frame-level joints adaptive graph convolutional network (FLAGCN)

  • In traditional skeleton-based human action recognition methods, the skeleton is treated as structured data similar to an image, and the spatial relationships between joints are ignored. The spatiotemporal graph convolutional network (ST-GCN) introduced graph convolution and defined a spatiotemporal skeleton sequence composed of nodes and edges, where nodes refer to the joints in the skeleton and edges are divided into two categories
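The spatiotemporal graph with two edge categories mentioned in the last highlight can be illustrated with a toy example. The sketch below (NumPy; the joint count, frame count, and bone list are made up for illustration) builds one node per (joint, frame) pair and adds spatial edges (bones within each frame) and temporal edges (the same joint in consecutive frames).

```python
import numpy as np

V, T = 5, 3                                # toy skeleton: 5 joints, 3 frames
bones = [(0, 1), (1, 2), (1, 3), (3, 4)]   # hypothetical bone list

N = V * T                                  # one graph node per (joint, frame)
A = np.zeros((N, N))

def node(v, t):
    """Index of joint v at frame t in the flattened spatiotemporal graph."""
    return t * V + v

# Edge category 1 (spatial): bones connecting joints inside each frame.
for t in range(T):
    for i, j in bones:
        A[node(i, t), node(j, t)] = A[node(j, t), node(i, t)] = 1

# Edge category 2 (temporal): the same joint in consecutive frames.
for t in range(T - 1):
    for v in range(V):
        A[node(v, t), node(v, t + 1)] = A[node(v, t + 1), node(v, t)] = 1

spatial_edges = len(bones) * T       # 4 bones x 3 frames = 12
temporal_edges = V * (T - 1)         # 5 joints x 2 transitions = 10
print(int(A.sum()) // 2, spatial_edges + temporal_edges)
```

Graph convolution on this combined adjacency then propagates information both within a frame and across time, which is the core idea ST-GCN introduced.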

Introduction

Human action recognition can be used in various scenes, such as video retrieval and human-computer interaction [1], so it has been widely discussed in the literature. Due to the limitations of data sets, skeleton-based human action recognition researchers mainly used manual feature-extraction and machine-learning methods. Yan et al. first applied a graph convolution method in a skeleton-based human action recognition study [19] and proposed the spatiotemporal graph convolutional network (ST-GCN). Subsequent methods make the network deeper and the structure of each layer more complex; they often introduce many parameters and extremely difficult training processes and frequently require many computing resources and long training times. These methods place high demands on the computing performance of the utilised equipment and take a long time to predict action sequences in practical applications. In the model proposed in this study, the three-dimensional coordinate features of joints are mainly extracted at the point level, the spatial features of all joints in each frame are extracted at the spatial level, and the temporal features of the whole sequence are extracted at the temporal level. Therefore, the model is simple, clear, and easy to understand. The ablation experiment confirms that the layered feature-extraction process utilised in this model can effectively improve recognition accuracy with a small number of required parameters.
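The hierarchical point/spatial/temporal extraction described above can be sketched as a simple pipeline. The code below (NumPy; all shapes, the pooling choices, and the random weights are illustrative assumptions, not the paper's layers) embeds each joint independently, then aggregates over joints within a frame, then over frames.

```python
import numpy as np

rng = np.random.default_rng(1)
T, V, C = 64, 25, 3                      # frames, joints, 3D coordinates (hypothetical)
seq = rng.standard_normal((T, V, C))     # one skeleton action sequence

# Point level: embed each joint's 3D coordinates independently.
W_point = rng.standard_normal((C, 16))
point_feat = np.maximum(seq @ W_point, 0)   # (T, V, 16), ReLU

# Spatial level: aggregate over all joints in each frame.
frame_feat = point_feat.mean(axis=1)        # (T, 16)

# Temporal level: summarise the whole sequence, then classify.
seq_feat = frame_feat.max(axis=0)           # (16,)
W_cls = rng.standard_normal((16, 60))       # 60 classes, as in NTU RGB+D 60
logits = seq_feat @ W_cls                   # (60,)
print(logits.shape)
```

Each stage only sees the output of the previous one, which is what keeps the hierarchy easy to follow and the parameter count small.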

Related Work
Methodologies
Point Level
Spatial Level
Coordinates embedding
Features fusion
Experiment
Ablation Study
Findings
Method
