Abstract

Skeleton-based action recognition has been widely studied in recent years. Most current approaches use graph convolutional networks (GCNs) to solve this task by modeling human joint data as a spatio-temporal graph. However, GCNs cannot effectively capture many long-term temporal motion relationships, so a recurrent neural network (RNN) is introduced to address this shortcoming. In this work, we propose a model named graph convolutional network with long time memory (GCN-LTM). Specifically, our model contains two task streams: a GCN stream and an RNN stream. The GCN stream captures spatial motion relationships, while the RNN stream focuses on extracting long-term temporal patterns. In addition, we introduce a contrastive learning strategy to better facilitate feature learning between these two streams. Multiple ablation experiments verify the feasibility of the proposed model, and extensive experiments show that it outperforms current state-of-the-art methods on two large-scale datasets, NTU-RGBD and NTU-RGBD-120.
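The abstract only outlines the architecture, so the following minimal PyTorch sketch illustrates the two-stream idea it describes: a GCN stream for spatial joint relationships, an RNN stream for long-term temporal patterns, and a contrastive term that aligns the two streams' features. The layer sizes, the single-layer graph convolution, the LSTM choice, and the NT-Xent-style loss are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of the two-stream GCN-LTM idea (assumptions, not the authors' exact model).
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphConv(nn.Module):
    """One spatial graph convolution over the joint graph: A_hat @ X @ W."""
    def __init__(self, in_ch, out_ch, adj):
        super().__init__()
        self.register_buffer("adj", adj)           # (V, V) normalized adjacency of the skeleton
        self.linear = nn.Linear(in_ch, out_ch)

    def forward(self, x):                           # x: (N, T, V, C) joint features
        x = torch.einsum("uv,ntvc->ntuc", self.adj, x)  # aggregate neighboring joints
        return F.relu(self.linear(x))


class GCNLTMSketch(nn.Module):
    """Two streams: graph convolution for spatial structure, LSTM for long-term temporal patterns."""
    def __init__(self, adj, in_ch=3, hidden=64, num_classes=60):
        super().__init__()
        num_joints = adj.size(0)
        self.gcn_stream = GraphConv(in_ch, hidden, adj)
        self.rnn_stream = nn.LSTM(in_ch * num_joints, hidden, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                           # x: (N, T, V, C) joint coordinates over T frames
        g = self.gcn_stream(x).mean(dim=(1, 2))     # spatial feature, pooled over time and joints
        r, _ = self.rnn_stream(x.flatten(2))        # temporal feature from the flattened joint sequence
        r = r[:, -1]                                # last hidden state summarizes the whole sequence
        logits = self.classifier(torch.cat([g, r], dim=-1))
        return logits, g, r


def contrastive_loss(g, r, temperature=0.1):
    """NT-Xent-style loss pulling the two streams' features of the same clip together."""
    g = F.normalize(g, dim=-1)
    r = F.normalize(r, dim=-1)
    logits = g @ r.t() / temperature                # (N, N) cross-stream similarity matrix
    targets = torch.arange(g.size(0), device=g.device)
    return F.cross_entropy(logits, targets)
```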
