Abstract

Skeleton data extracted from human poses is valuable for action recognition because it reduces the effect of environmental noise, such as changes in background and lighting conditions. Although graph convolutional networks (GCNs) can learn distinctive action features, they fail to fully exploit prior knowledge of human body structure and the coordination relations between limbs. To address these issues, this paper proposes a Multi-level Topological Channel Attention Network with three components. First, the Multi-level Topology and Channel Attention Module incorporates prior knowledge of human body structure in a coarse-to-fine manner to extract action features effectively. Second, the Coordination Module exploits the contralateral and ipsilateral coordinated movements described in human kinematics. Third, the Multi-scale Global Spatio-temporal Attention Module captures spatio-temporal features at different granularities and uses a causal convolution block and masked temporal attention to prevent non-causal dependencies. The method achieves accuracies of 91.9% (Xsub) and 96.3% (Xview) on NTU-RGB+D 60, and 88.5% (Xsub) and 90.3% (Xset) on NTU-RGB+D 120.
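The abstract does not give the details of the causal convolution block or the masked temporal attention, so the following is only a minimal sketch of the general idea that both mechanisms share: the output at frame t may depend on frames up to t, never on later ones. The function names, the single-channel input, and the plain-Python lists are illustrative assumptions, not the paper's implementation.

```python
import math


def causal_conv1d(x, kernel):
    """Hypothetical single-channel causal convolution over time.

    The sequence is left-padded with zeros so that output[t] is computed
    from x[t - k + 1 .. t] only, where k is the kernel length.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(x))
    ]


def masked_temporal_attention(scores):
    """Hypothetical causal masking of a square score matrix scores[t][s].

    Each row t is normalized with a softmax over frames s <= t; frames
    s > t receive weight zero.
    """
    weights = []
    for t, row in enumerate(scores):
        visible = row[: t + 1]
        m = max(visible)  # subtract the maximum for numerical stability
        exps = [math.exp(v - m) for v in visible]
        z = sum(exps)
        weights.append([e / z for e in exps] + [0.0] * (len(row) - t - 1))
    return weights
```

With this masking, changing a later frame leaves the outputs for all earlier frames unchanged, which is the property the abstract refers to as preventing non-causal relationships.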
