Abstract

Skeleton-based action recognition has attracted increasing interest in recent years. Because it can flexibly model long-range dependencies between joints, the self-attention module has become a basic component of skeleton-based action recognition models. However, the global receptive field of self-attention is poorly suited to modeling skeleton locality, and self-attention models carry little inductive bias, which makes them prone to overfitting. In this paper, we propose an attention graph convolutional network (AGCN) with multi-scale sampling to effectively model both the local and the global features of the skeleton. First, we propose two extreme sampling strategies for generating and ordering the neighboring nodes of each root node. A local-first sampling method constructs local graph windows, and a global-first sampling method gathers long-range joints to construct global graph windows. Both sampling methods introduce skeleton-specific inductive biases that regularize the model capacity. Second, the AGCN combines the self-attention mechanism with the graph convolution operation, which alleviates the over-smoothing of graph convolution and preserves translation invariance. With the multi-scale sampling strategy, the AGCN can effectively model both the locality and the non-locality of the skeleton. Finally, by coupling these components, we develop a two-pathway model for multi-scale feature fusion. Extensive experiments demonstrate that our model achieves performance comparable to state-of-the-art methods on the NTU RGB+D 60, NTU RGB+D 120, UAV-HUMAN, and NW-UCLA datasets.
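The abstract does not specify how neighbors are generated and ordered, so the following is only a minimal sketch of one plausible reading: neighbors of a root joint are ranked by hop distance on the skeleton graph, nearest first for a local window and farthest first for a global window. The 7-joint toy skeleton, the function names (`build_adjacency`, `hop_distances`, `sample_window`), the window size, and the tie-breaking by joint index are all assumptions made for illustration, not the authors' method.

```python
from collections import deque

# Hypothetical 7-joint toy skeleton (not a real dataset layout):
# 0 torso, 1 neck, 2 head, 3 left shoulder, 4 left hand,
# 5 right shoulder, 6 right hand
EDGES = [(0, 1), (1, 2), (1, 3), (3, 4), (1, 5), (5, 6)]


def build_adjacency(num_joints, edges):
    """Build an undirected adjacency list from bone pairs."""
    adj = {j: [] for j in range(num_joints)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def hop_distances(adj, root):
    """Breadth-first search: number of bones between root and each joint."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def sample_window(adj, root, size, local_first=True):
    """Return the root followed by size - 1 ordered neighbors.

    Assumption: local-first ranks joints by ascending hop distance,
    global-first by descending hop distance; ties go to the lower index.
    """
    dist = hop_distances(adj, root)
    others = [j for j in dist if j != root]
    others.sort(key=lambda j: (dist[j] if local_first else -dist[j], j))
    return [root] + others[: size - 1]


adj = build_adjacency(7, EDGES)
local_window = sample_window(adj, root=4, size=4, local_first=True)
global_window = sample_window(adj, root=4, size=4, local_first=False)
print(local_window)   # [4, 3, 1, 0]
print(global_window)  # [4, 6, 0, 2]
```

With the left hand (joint 4) as root, the local window collects the arm and trunk joints nearest to it, while the global window reaches first for the opposite hand. Under this reading, the two windows would feed the two pathways of the model.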
