Abstract

Dynamic skeletal data has been widely studied for human action tasks because it carries high-level semantic information and is far more compact than RGB features. However, previous attention-based methods fail to capture the local grouped joint dependence of the human body, which is vital for distinguishing actions in fine-grained tasks such as skeletal action segmentation and recognition. This work proposes spatial focus attention for fine-grained skeleton-based action tasks. Specifically, we decouple the attention map to adaptively enhance grouped joint dependence according to a decoupling probability. To further focus on local grouped dependence, tree-structured attention maps are built by hierarchical decoupling, guiding the model to focus on complementary local dependence in different leaf nodes. Our approach achieves state-of-the-art performance on fine-grained skeleton-based human action segmentation (MCFS-22) and recognition (FSD-10). In addition, the proposed spatial focus attention also achieves outstanding performance on the coarse-grained NTU-60 dataset.
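
The abstract describes the decoupling mechanism only at a high level. Below is a minimal PyTorch sketch of how a spatial attention map over joints might be decoupled into local body-part groups under a decoupling probability. The class name SpatialFocusAttention, the five-part grouping of 25 NTU-style joints, and the mask-and-renormalize gating are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn as nn


class SpatialFocusAttention(nn.Module):
    # Minimal sketch, assuming per-frame joint features and a fixed body-part
    # grouping; the grouping and the gating scheme are hypothetical.
    def __init__(self, dim, num_joints=25, decouple_p=0.5):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.decouple_p = decouple_p  # probability of using the decoupled (grouped) map
        # Hypothetical grouping of 25 joints into five body parts.
        groups = [range(0, 5), range(5, 10), range(10, 15), range(15, 20), range(20, 25)]
        mask = torch.zeros(num_joints, num_joints)
        for g in groups:
            idx = torch.tensor(list(g))
            mask[idx.unsqueeze(1), idx] = 1.0  # allow attention only within the group
        self.register_buffer("group_mask", mask)

    def forward(self, x):
        # x: (batch, num_joints, dim) joint features for one frame.
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        if self.training and torch.rand(()).item() < self.decouple_p:
            # Decoupled branch: keep only within-group entries and renormalize,
            # which emphasizes local grouped joint dependence.
            attn = attn * self.group_mask
            attn = attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        return attn @ v

The sketch shows a single level of decoupling; in the hierarchical variant described in the abstract, each group would be split again into sub-groups, yielding tree-structured attention maps whose leaf nodes attend to complementary local dependence.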
