Abstract

Skeleton-based action recognition has progressed rapidly in recent years, but open problems remain: in particular, the skeleton-sequence representations learned by most existing methods lack spatial structure information and fine-grained temporal dynamics. To address this, we propose a novel Deep Stacked Bidirectional LSTM Network (DSB-LSTM) for human action recognition from skeleton data. Specifically, we first exploit human body geometry to extract skeletal modulus ratio (MR) features and skeletal vector angle (VA) features from the skeleton data. The DSB-LSTM is then applied to learn both spatial and temporal representations from the MR and VA features, yielding not only more powerful representations but also stronger generalization. Experiments on the MSR Action3D, Florence 3D, and UTKinect-Action datasets show that our approach outperforms the compared methods on all three, demonstrating the effectiveness of the DSB-LSTM.
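
To make the pipeline concrete, the following is a minimal sketch in PyTorch of the two stages the abstract describes: geometric feature extraction followed by a stacked bidirectional LSTM. The exact definitions of MR and VA, the joint connectivity, and all layer sizes are assumptions for illustration only; MR is read here as bone-length ratios against a reference bone and VA as angles between bone vectors, which the abstract does not spell out.

```python
import torch
import torch.nn as nn

def skeletal_features(joints, bone_pairs, angle_pairs, ref_bone=0, eps=1e-8):
    """joints: (T, J, 3) joint coordinates for one skeleton sequence.

    Returns per-frame MR features (bone-length ratios against a reference
    bone; an assumed reading of "modulus ratio") concatenated with VA
    features (angles between bone vectors; an assumed reading of
    "vector angle").
    """
    # Bone vectors between connected joints: (T, B, 3)
    bones = torch.stack([joints[:, b] - joints[:, a] for a, b in bone_pairs], dim=1)
    lengths = bones.norm(dim=-1)                                  # (T, B)
    mr = lengths / (lengths[:, ref_bone:ref_bone + 1] + eps)      # modulus ratios
    # Angles between selected bone pairs via the normalized dot product: (T, A)
    va = torch.stack([
        torch.acos(torch.clamp(
            (bones[:, i] * bones[:, j]).sum(-1)
            / (lengths[:, i] * lengths[:, j] + eps), -1.0, 1.0))
        for i, j in angle_pairs], dim=1)
    return torch.cat([mr, va], dim=1)                             # (T, B + A)

class DSBLSTM(nn.Module):
    """Deep stacked bidirectional LSTM over per-frame MR+VA features."""
    def __init__(self, in_dim, hidden=128, layers=3, n_classes=20, dropout=0.5):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, num_layers=layers,
                           bidirectional=True, batch_first=True, dropout=dropout)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                      # x: (N, T, in_dim)
        h, _ = self.rnn(x)                     # (N, T, 2 * hidden)
        return self.fc(h.mean(dim=1))          # temporal average pooling
```

A hypothetical usage, with a made-up connectivity to show the tensor shapes:

```python
T, J = 60, 20
joints = torch.randn(T, J, 3)                    # one synthetic sequence
bone_pairs = [(0, 1), (1, 2), (2, 3)]            # hypothetical joint connectivity
angle_pairs = [(0, 1), (1, 2)]                   # angles between adjacent bones
feats = skeletal_features(joints, bone_pairs, angle_pairs)   # (T, 5)
model = DSBLSTM(in_dim=feats.shape[1])
logits = model(feats.unsqueeze(0))               # (1, n_classes)
```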
