Abstract

Action recognition with skeleton data is a challenging task in computer vision. Graph convolutional networks (GCNs), which directly model human body skeletons as a graph structure, have achieved remarkable performance. However, current GCN architectures are limited by the small receptive field of their convolution filters: they capture only local physical dependencies among joints and use all skeleton data indiscriminately. To address these limitations and to achieve a flexible graph representation of the skeleton features, we propose a novel semantics-guided graph convolutional network (Sem-GCN) for skeleton-based action recognition. Three types of semantic graph modules (a structural graph extraction module, an actional graph inference module, and an attention graph iteration module) are employed in Sem-GCN to aggregate information from L-hop joint neighbors, to capture action-specific latent dependencies, and to assign importance levels. Combining these semantic graphs into a generalized skeleton graph, we further propose the semantics-guided graph convolution block, which stacks semantic graph convolution and temporal convolution to learn both semantic and temporal features for action recognition. Experimental results demonstrate the effectiveness of our proposed model on the widely used NTU and Kinetics datasets.
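To make the block structure concrete, here is a minimal PyTorch sketch of a semantics-guided graph convolution block as the abstract describes it: a spatial graph convolution over a combined skeleton graph followed by a temporal convolution along the frame axis. This is not the authors' implementation; the module names, the use of a single learned adjacency standing in for the fused semantic graphs, and all hyperparameters are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SemGCNBlock(nn.Module):
    """Hypothetical sketch of a semantics-guided graph convolution block:
    spatial graph convolution over a combined semantic graph, then a
    temporal convolution over frames. Not the paper's actual code."""

    def __init__(self, in_channels, out_channels, num_joints, t_kernel=9):
        super().__init__()
        # Placeholder for the generalized skeleton graph. In Sem-GCN this
        # would fuse the structural (L-hop), actional, and attention
        # graphs; how they are combined is assumed here, not specified.
        self.A = nn.Parameter(torch.eye(num_joints))
        self.spatial = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.temporal = nn.Conv2d(
            out_channels, out_channels,
            kernel_size=(t_kernel, 1), padding=(t_kernel // 2, 0))
        self.relu = nn.ReLU()

    def forward(self, x):
        # x: (batch, channels, frames, joints)
        x = self.spatial(x)
        # Aggregate each joint's neighbors via the learned semantic graph.
        x = torch.einsum('nctv,vw->nctw', x, self.A)
        return self.relu(self.temporal(x))

# Example usage: NTU skeletons have 25 joints; 300 frames is a common clip length.
block = SemGCNBlock(in_channels=3, out_channels=64, num_joints=25)
out = block(torch.randn(8, 3, 300, 25))  # -> (8, 64, 300, 25)
```

Stacking several such blocks, as the abstract suggests, would interleave semantic (spatial) and temporal feature extraction at increasing channel widths.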
