Abstract
Current skeleton-based human action recognition methods usually assume complete skeletons. In real scenarios, however, capturing incomplete or noisy skeletons is inevitable. When the information of some joints is occluded or corrupted by noise, the performance of current methods may drop significantly. To improve the robustness of action recognition models, we propose a multi-stream dynamic graph convolutional network (GCN) that explores sufficient discriminative features distributed over all skeleton joints. Through the multi-stream structure, gradient information of the graph structure is aggregated progressively. We introduce class activation map (CAM) techniques to extract the most informative joints in each stream. The activation maps of each stream are passed to the next stream as mask matrices, so that the new stream explores the joints that have not yet been activated. In addition, we set up a depth module so that the feature values of different nodes remain distinguishable after multiple aggregations. Experiments show that our model achieves state-of-the-art performance on the NTU-RGB+D dataset. It also shows strong robustness on the jittering dataset.
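The stream-to-stream masking described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the top-k selection rule, and the use of a plain per-joint score vector in place of a real class activation map are all assumptions made for illustration.

```python
import numpy as np


def select_activated_joints(scores, mask, top_k):
    """Pick the top_k highest-scoring joints among those still unmasked.

    `scores` stands in for a per-joint class activation map (assumption).
    `mask` holds 1.0 for joints still available and 0.0 for joints
    already activated by an earlier stream.
    """
    # Joints that are masked out can never be selected.
    masked_scores = np.where(mask > 0, scores, -np.inf)
    order = np.argsort(-masked_scores, kind="stable")
    available = int(mask.sum())
    return order[: min(top_k, available)]


def progressive_masks(stream_scores, top_k):
    """Run a hypothetical multi-stream loop over per-stream joint scores.

    Returns the mask fed to each stream and the joints each stream activates.
    """
    num_joints = stream_scores.shape[1]
    mask = np.ones(num_joints)
    masks, activated = [], []
    for scores in stream_scores:
        masks.append(mask.copy())          # mask seen by this stream
        chosen = select_activated_joints(scores, mask, top_k)
        activated.append(sorted(chosen.tolist()))
        mask[chosen] = 0.0                 # hide these joints from later streams
    return masks, activated


# Toy example: 3 streams, 5 joints, hypothetical activation scores.
toy_scores = np.array([
    [0.9, 0.1, 0.8, 0.2, 0.3],
    [0.9, 0.7, 0.8, 0.2, 0.6],
    [0.5, 0.5, 0.5, 0.9, 0.1],
])
toy_masks, toy_activated = progressive_masks(toy_scores, top_k=2)
```

In this toy setting each later stream is forced onto joints that earlier streams did not select, which is the intuition behind using activation maps as masks. A real model would compute the scores from the network's feature maps and class weights rather than take them as input.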