Abstract

Skeleton-based human action recognition has attracted increasing attention owing to its ease of implementation and stable performance in intelligent human-robot interaction. However, most existing studies build recognition models from skeleton data alone and ignore complementary image semantics, which makes similar actions difficult to distinguish because their skeleton sequences are ambiguous. Here, a center-connected graph convolutional network enhanced with salient image features (SIFE-CGCN) is proposed to address the problem of similar-action recognition. First, a center-connected graph convolutional network (CGCN) is constructed to capture the small differences between similar actions by exploring possible collaborations among all joints. Subsequently, a movement-change metric is employed to select the salient image from an action video, and EfficientNet is then used to classify the action semantics of the salient images. Finally, the recognition results of the CGCN are fused with the classification results of the salient images to further improve recognition accuracy for similar actions. Additionally, a metric is proposed to measure action similarity from skeleton data, and a similar-action dataset is built on this basis. Extensive experiments on the similar-action dataset and the NTU RGB+D 60/120 datasets were conducted to verify the performance of the proposed method. The results validate the effectiveness of salient image feature enhancement and show that the proposed SIFE-CGCN achieves state-of-the-art performance on both the similar-action and NTU RGB+D 60/120 datasets.
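The final fusion step described above can be sketched as a score-level combination of the two classifiers' softmax outputs. This is a minimal illustrative sketch, not the paper's exact formulation: the function name `fuse_scores` and the fusion weight `alpha` are assumptions introduced here for clarity.

```python
import numpy as np

def fuse_scores(gcn_scores, img_scores, alpha=0.5):
    """Weighted score-level fusion of two classifiers' class-probability vectors.

    gcn_scores: softmax output of the skeleton-based CGCN.
    img_scores: softmax output of the salient-image EfficientNet classifier.
    alpha: hypothetical fusion weight (the abstract does not specify one).
    Returns the fused class index and the fused probability vector.
    """
    gcn_scores = np.asarray(gcn_scores, dtype=float)
    img_scores = np.asarray(img_scores, dtype=float)
    fused = alpha * gcn_scores + (1.0 - alpha) * img_scores
    return int(np.argmax(fused)), fused

# Example: the skeleton branch is nearly ambiguous between two similar
# actions, while the image branch is confident; fusion resolves the tie.
pred, fused = fuse_scores([0.51, 0.49], [0.20, 0.80])
```

With equal weighting the image evidence dominates the near-tied skeleton scores, which is the intended effect of the salient-image enhancement: the image branch supplies the semantic cue that the skeleton data lacks.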
