Abstract

Previous works have recognized that spatiotemporal entanglement features cannot be ignored in skeleton-based motion recognition, but they have not broken away from the constraints of the traditional GCN: the entanglement feature is still modeled by an extended single-frame adjacency matrix. We introduce a new joint-correlation determination mechanism that constructs the connection relationships from a non-linear transformation of the distances between joints across multiple frames. The proposed mechanism improves accuracy while significantly reducing the number of parameters. Meanwhile, recent works have addressed the fact that most actions depend only on the dynamics of local joints by aggregating features of different body parts in parallel, but the interaction between these features remains limited to simple concatenation or addition. We propose a progressive inward-outward structure (PIS) that extracts the joint features relevant to the action while maintaining a lightweight link between these joints and the rest. Integrating these two designs, we propose the Spatiotemporal Progressive Inward-Outward Aggregation Network (SPIANet) to model the complex spatiotemporal entanglement between joints during human motion. SPIANet is validated on three public datasets (NTU-RGB+D60, NTU-RGB+D120, and UESTC varying-view) and outperforms state-of-the-art methods.
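
As an illustration of the distance-based connection mechanism described above, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel as the non-linear transformation and row normalization of the resulting weights; the function name `spatiotemporal_adjacency` and the parameter `sigma` are hypothetical, since the abstract does not specify the transform.

```python
import math


def spatiotemporal_adjacency(frames, sigma=1.0):
    """Build a dense adjacency over all (frame, joint) nodes.

    frames: list of T frames, each a list of V joints given as (x, y, z).
    Returns a (T*V) x (T*V) row-normalized weight matrix as nested lists.
    The Gaussian kernel is an assumed choice of non-linear transform.
    """
    # Flatten every joint of every frame into one node list, so that
    # pairs of joints in different frames also receive a weight.
    nodes = [joint for frame in frames for joint in frame]
    adjacency = []
    for a in nodes:
        # Non-linear transform of the Euclidean distance: close joints
        # get weights near 1, distant joints get weights near 0.
        row = [
            math.exp(-math.dist(a, b) ** 2 / (2.0 * sigma ** 2))
            for b in nodes
        ]
        total = sum(row)  # always > 0 because the self-distance is 0
        adjacency.append([w / total for w in row])
    return adjacency


# Two frames with two joints each; the joints move slightly between frames.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)],
]
A = spatiotemporal_adjacency(frames, sigma=0.5)
```

In this toy input, joint 0 in the first frame is connected more strongly to itself in the second frame than to joint 1 in the same frame, because the weights depend only on distance and not on a fixed skeleton graph. A matrix built this way has no learnable entries, which is one plausible reading of how such a mechanism could reduce the parameter count.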
