Abstract

Skeleton-based action recognition has been widely studied in recent years. Most current approaches use graph convolutional networks (GCNs) to solve this task by modeling human joint data as a spatio-temporal graph. However, GCNs cannot effectively capture many long-term temporal motion relationships, so a recurrent neural network (RNN) is introduced to address this shortcoming. In this work, we propose a model named graph convolutional network with long time memory (GCN-LTM). Specifically, our model contains two task streams: a GCN stream and an RNN stream. The GCN stream captures spatial motion relationships, while the RNN stream focuses on extracting long-term temporal patterns. In addition, we introduce a contrastive learning strategy to better facilitate feature learning between these two streams. Multiple ablation experiments verify the feasibility of the proposed model, and extensive experiments show that it outperforms current state-of-the-art methods on two large-scale datasets, NTU-RGBD and NTU-RGBD-120.
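The abstract only outlines the architecture, so the following minimal PyTorch sketch illustrates the two-stream idea it describes: a GCN stream for spatial joint relationships, an RNN stream for long-term temporal patterns, and a contrastive term that aligns the two streams' features. The layer sizes, the single-layer graph convolution, the LSTM choice, and the NT-Xent-style loss are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of the two-stream GCN-LTM idea (assumptions, not the authors' exact model).
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphConv(nn.Module):
    """One spatial graph convolution over the joint graph: A_hat @ X @ W."""
    def __init__(self, in_ch, out_ch, adj):
        super().__init__()
        self.register_buffer("adj", adj)           # (V, V) normalized adjacency of the skeleton
        self.linear = nn.Linear(in_ch, out_ch)

    def forward(self, x):                           # x: (N, T, V, C) joint features
        x = torch.einsum("uv,ntvc->ntuc", self.adj, x)  # aggregate neighboring joints
        return F.relu(self.linear(x))


class GCNLTMSketch(nn.Module):
    """Two streams: graph convolution for spatial structure, LSTM for long-term temporal patterns."""
    def __init__(self, adj, in_ch=3, hidden=64, num_classes=60):
        super().__init__()
        num_joints = adj.size(0)
        self.gcn_stream = GraphConv(in_ch, hidden, adj)
        self.rnn_stream = nn.LSTM(in_ch * num_joints, hidden, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                           # x: (N, T, V, C) joint coordinates over T frames
        g = self.gcn_stream(x).mean(dim=(1, 2))     # spatial feature, pooled over time and joints
        r, _ = self.rnn_stream(x.flatten(2))        # temporal feature from the flattened joint sequence
        r = r[:, -1]                                # last hidden state summarizes the whole sequence
        logits = self.classifier(torch.cat([g, r], dim=-1))
        return logits, g, r


def contrastive_loss(g, r, temperature=0.1):
    """NT-Xent-style loss pulling the two streams' features of the same clip together."""
    g = F.normalize(g, dim=-1)
    r = F.normalize(r, dim=-1)
    logits = g @ r.t() / temperature                # (N, N) cross-stream similarity matrix
    targets = torch.arange(g.size(0), device=g.device)
    return F.cross_entropy(logits, targets)
```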
