Hierarchical Vertex-Wise Intensification Graph Convolution for Skeleton-Based Activity Recognition
Graph convolutional networks (GCNs), which can effectively capture the spatial and temporal relationships between skeleton joints through graph topology, have shown promising performance in skeleton-based activity recognition in recent years. These methods typically learn the semantic features of the vertices of a skeleton and the associated adjacency matrix. However, how to efficiently establish relationships between vertices remains a substantial problem. To address it, we propose a novel Hierarchical Vertex-wise Intensification Graph Convolution Network (HVI-GCN) for skeleton-based action recognition. The proposed module dilates input features into higher dimensions to broaden the temporal horizon, and builds a vertex-wise topology based on self-adaptively learned attention. With the resulting adjacency matrix, features from other positions can be collected to aid the prediction at the current position. The proposed module provides a better receptive field and semantic understanding of both the spatial and temporal domains than related methods. Experiments were conducted mainly on the NTU-RGB-D 60, NTU-RGB-D 120, and NW-UCLA datasets, with joint and bone data integrated with motion sequences. Experimental results show that HVI-GCN improves accuracy by up to 1.1% on the NTU-RGB-D 120 dataset, and by up to 1.4% and 1.2% on the NTU-RGB-D 60 and NW-UCLA datasets, respectively.
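The idea of building a vertex-wise topology from learned attention and then collecting features from other positions can be illustrated with a minimal sketch. It is not the HVI-GCN module itself: the abstract does not specify how the attention is computed, so scaled dot-product attention is assumed here, and the hierarchical structure and temporal dilation are omitted. The function names (`attention_adjacency`, `aggregate`), the projection matrices `w_q`, `w_k`, `w_v`, and all dimensions are hypothetical, and the weights are random rather than learned.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_adjacency(x, w_q, w_k):
    """Build a dense V x V adjacency matrix from vertex features.

    Assumption: scaled dot-product attention between projected features.
    Row i holds the weights with which vertex i attends to every vertex.
    """
    q = matmul(x, w_q)                      # (V, D) queries
    k = matmul(x, w_k)                      # (V, D) keys
    k_t = [list(col) for col in zip(*k)]    # (D, V) transposed keys
    scale = math.sqrt(len(w_q[0]))
    scores = matmul(q, k_t)                 # (V, V) pairwise similarities
    return [softmax([s / scale for s in row]) for row in scores]

def aggregate(adj, x, w_v):
    """Collect projected features from all vertices, weighted by adj."""
    return matmul(adj, matmul(x, w_v))      # (V, C_out)

def random_matrix(rows, cols, rng):
    """Stand-in for learned parameters or input features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
# Illustrative sizes: 25 joints per skeleton, small feature dimensions.
num_joints, in_dim, attn_dim, out_dim = 25, 8, 4, 16
x = random_matrix(num_joints, in_dim, rng)
w_q = random_matrix(in_dim, attn_dim, rng)
w_k = random_matrix(in_dim, attn_dim, rng)
w_v = random_matrix(in_dim, out_dim, rng)

adj = attention_adjacency(x, w_q, w_k)
out = aggregate(adj, x, w_v)
```

Because each row of `adj` is normalized by a softmax, every output vertex is a convex combination of the projected features of all vertices, which is one simple way features from other positions can inform the current one.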