Abstract
Skeleton-based action recognition aims to recognize human actions from the inherent characteristics of given skeleton sequences, and it has attracted increasing attention because of its great potential in practical applications. Previous methods have shown that learning discriminative spatial and temporal features from skeleton sequences is crucial for recognizing human actions. Nevertheless, modeling spatio-temporal evolution remains a challenging problem. In this work, we propose a novel hierarchical spatial reasoning and temporal stack learning network (HSR-TSL) that learns discriminative spatial and temporal features for human action recognition. It consists of a hierarchical spatial reasoning network (HSRN) and a temporal stack learning network (TSLN). Specifically, the HSRN employs a hierarchical residual graph neural network to capture two levels of spatial features: intra-part spatial information within each body part and body-level structural information between parts. The TSLN models the detailed temporal dynamics of skeleton sequences with a composition of multiple skip-clip LSTMs. During training, we develop a clip-based incremental loss to optimize the model effectively. We perform extensive experiments on five challenging benchmarks to verify the effectiveness of each component of our model. The comparison results show that our approach significantly improves performance on skeleton-based action recognition.
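As a rough illustration of the clip-based training idea, the sketch below splits a sequence into consecutive clips and sums per-clip cross-entropy terms with weights that grow with the clip index. The abstract does not give the loss formula, so the linear weight m/M, the non-overlapping clips, and the function names `split_into_clips` and `clip_incremental_loss` are assumptions made for illustration, not the authors' implementation.

```python
import math


def split_into_clips(frames, clip_len):
    """Split a list of frames into consecutive, non-overlapping clips.

    Assumption: trailing frames that do not fill a whole clip are dropped.
    """
    n_clips = len(frames) // clip_len
    return [frames[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]


def clip_incremental_loss(clip_probs, label):
    """Weighted sum of per-clip cross-entropy terms.

    clip_probs[m - 1] is the predicted class distribution after clip m.
    Assumption: clip m of M gets weight m / M, so later clips, which have
    seen more of the sequence, contribute more to the loss.
    """
    num_clips = len(clip_probs)
    loss = 0.0
    for m, probs in enumerate(clip_probs, start=1):
        loss += (m / num_clips) * -math.log(probs[label])
    return loss


# Toy usage: 10 frame indices, clips of 3 frames each.
clips = split_into_clips(list(range(10)), 3)
print(clips)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# Hypothetical two-class predictions after each of two clips.
toy_probs = [[0.5, 0.5], [0.5, 0.5]]
print(clip_incremental_loss(toy_probs, label=0))
```

In a full model, each clip's class distribution would come from the temporal network's output after that clip; here the distributions are hand-written placeholders.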