Multiple Input Branches Shift Graph Convolutional Network with DropEdge for Skeleton-Based Action Recognition

Yan Liu, Yuelin Deng, Jinping Su, Ruonan Wang, Chi Li

doi:10.1007/978-3-031-06427-2_49

Abstract

Graph Convolutional Networks (GCNs) have achieved remarkable success in skeleton-based action recognition. However, recent state-of-the-art (SOTA) methods for this task usually have large model sizes and high computational complexity. In this work, we propose an early-fused model, Multiple Input Branches Shift Graph Convolutional Network with DropEdge (MIBSD-GCN). First, to reduce the complexity of the multi-stream model, we introduce a lightweight Shift Graph Convolutional Network (Shift-GCN) block. It is embedded into an early-fused architecture, Multiple Input Branches (MIB), which enriches the input features and suppresses model redundancy. Then, a novel spherical coordinate representation is added as one of the input branches to improve recognition performance. Finally, we design the Shift Graph Convolutional Network with DropEdge (SD-GCN) to prevent over-fitting and over-smoothing while maintaining model accuracy. Extensive experiments on two large-scale datasets, NTU RGB+D 60 and NTU RGB+D 120, show that the proposed model outperforms previous SOTA methods. We achieve 96.6% accuracy on the Cross-view benchmark of NTU RGB+D 60 while requiring 3.4–16.5 times fewer FLOPs than other SOTA models.

Keywords: Skeleton-based action recognition, Multiple Input Branches, Shift Graph Convolutional Network, DropEdge, Spherical coordinate
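Two of the ingredients named in the abstract, the spherical coordinate input branch and DropEdge, can be illustrated with a minimal Python sketch. The abstract does not specify the angle convention used for the spherical representation, nor how edges are dropped inside SD-GCN, so the function names `cartesian_to_spherical` and `drop_edge`, the polar/azimuth convention, and the fixed drop rate below are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def cartesian_to_spherical(x, y, z):
    """Convert one 3D joint position to (r, theta, phi).

    Assumed convention (not taken from the paper): theta is the polar
    angle measured from the +z axis, phi is the azimuth in the x-y plane.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # The origin has no defined direction; return zeros by convention.
        return 0.0, 0.0, 0.0
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    return r, theta, phi


def drop_edge(edges, drop_rate, rng):
    """Return a random subset of edges, removing a fraction drop_rate.

    This mirrors the general DropEdge idea of randomly removing graph
    edges during training; the exact scheme in SD-GCN may differ.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    keep = len(edges) - int(len(edges) * drop_rate)
    return rng.sample(edges, keep)


# Hypothetical usage: a toy 5-joint chain and a single joint position.
skeleton_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
sparse_edges = drop_edge(skeleton_edges, 0.5, random.Random(0))
joint_spherical = cartesian_to_spherical(1.0, 0.0, 0.0)
```

In a training loop, a fresh edge subset would typically be sampled at each iteration, while the full skeleton graph would be used at inference time.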

Full Text