Abstract

Skeleton-based action recognition has attracted increasing attention in recent years. However, current skeleton-based action recognition models still require large numbers of parameters and computations to achieve superior accuracy. Despite their effectiveness, this heavy parameter/computation cost hinders the deployment of action recognition models on edge devices such as mobile phones. How to obtain high accuracy while maintaining low computational/parameter cost remains a difficult yet significant challenge. In light of these issues, we propose group-shuffle graph convolutional networks (GS-GCNs) for lightweight skeleton-based action recognition in videos. Specifically, GS-GCNs consist of two sequential modules: a group-shuffle graph convolutional module (GSC) and a depthwise-shuffle separable convolution module (DSC). GSC divides the input features into several groups along the channel dimension, shuffles the groups, and feeds each group into a separate sub-GCN to model the relationships between the joints of the skeleton. DSC then applies a depthwise separable convolution to each group and shuffles the groups again. The final output is the concatenation of all group features. Essentially, through this group-shuffle strategy, GS-GCNs significantly reduce the computational/parameter cost while retaining competitive recognition ability by stacking these modules. Extensive experiments show that GS-GCN achieves excellent performance on both the NTU-RGB+D and NTU-RGB+D 120 datasets with a model size an order of magnitude smaller than most previous works.

Keywords: Action recognition, Graph convolutional network, Lightweight model
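The abstract describes the two modules only at a high level. The following is a minimal PyTorch sketch, not the authors' implementation, of how a group-shuffle graph convolution (GSC) and a depthwise-shuffle separable convolution (DSC) block might look. The class names, the group count, the sub-GCN design (a 1x1 feature transform plus a learnable adjacency matrix), and the (batch, channels, frames, joints) tensor layout are all illustrative assumptions.

```python
# Illustrative sketch of the group-shuffle idea (assumptions, not the paper's code).
# Input tensors are assumed to have shape (batch, channels, frames, joints).
import torch
import torch.nn as nn


def channel_shuffle(x, groups):
    # ShuffleNet-style channel shuffle: interleave channels across groups.
    n, c, t, v = x.shape
    x = x.view(n, groups, c // groups, t, v)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, t, v)


class SubGCN(nn.Module):
    # One lightweight sub-GCN applied to a single channel group:
    # 1x1 feature transform followed by aggregation over the skeleton
    # graph via a learnable adjacency matrix (an assumed design).
    def __init__(self, channels, num_joints):
        super().__init__()
        self.theta = nn.Conv2d(channels, channels, kernel_size=1)
        self.adj = nn.Parameter(torch.eye(num_joints))

    def forward(self, x):
        x = self.theta(x)                                  # (N, C_g, T, V)
        return torch.einsum('nctv,vw->nctw', x, self.adj)  # aggregate over joints


class GSCModule(nn.Module):
    # Group-Shuffle graph Convolution: shuffle channels, split them into
    # groups, run each group through its own sub-GCN, then concatenate.
    def __init__(self, channels, num_joints, groups=4):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        self.sub_gcns = nn.ModuleList(
            [SubGCN(channels // groups, num_joints) for _ in range(groups)]
        )

    def forward(self, x):
        x = channel_shuffle(x, self.groups)
        chunks = torch.chunk(x, self.groups, dim=1)
        out = [gcn(chunk) for gcn, chunk in zip(self.sub_gcns, chunks)]
        return torch.cat(out, dim=1)


class DSCModule(nn.Module):
    # Depthwise-Shuffle separable Convolution along the temporal axis:
    # per-channel (depthwise) temporal conv, group-wise pointwise conv,
    # then a channel shuffle across groups.
    def __init__(self, channels, groups=4, kernel_size=9):
        super().__init__()
        self.groups = groups
        self.depthwise = nn.Conv2d(
            channels, channels, kernel_size=(kernel_size, 1),
            padding=(kernel_size // 2, 0), groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1,
                                   groups=groups)

    def forward(self, x):
        x = self.pointwise(self.depthwise(x))
        return channel_shuffle(x, self.groups)


if __name__ == "__main__":
    # Example: 2 clips, 64 channels, 32 frames, 25 joints (NTU skeletons have 25 joints).
    x = torch.randn(2, 64, 32, 25)
    block = nn.Sequential(GSCModule(64, num_joints=25), DSCModule(64))
    print(block(x).shape)  # torch.Size([2, 64, 32, 25])
```

Because each sub-GCN and each depthwise/pointwise convolution operates on only a fraction of the channels, the parameter and FLOP counts shrink roughly in proportion to the number of groups, while the shuffle step keeps information flowing across groups.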
