Abstract

Skeleton-based human action recognition has attracted increasing attention recently, thanks to the accessibility of depth sensors and advances in pose estimation techniques. Conventional approaches such as convolutional neural networks usually model skeletons with grid-shaped representations, which cannot explicitly capture the dependency between two correlated joints. In this paper, we treat the skeleton as a single graph with joints as nodes and bones as edges. Based on the skeleton graph, we propose an improved graph convolutional network, the adaptive multi-view graph convolutional networks (AMV-GCNs), for skeleton-based action recognition. We first construct a novel skeleton graph in which two kinds of graph nodes model the spatial configuration and the temporal dynamics, respectively. The generated graphs, along with feature vectors on the graph nodes, are then fed into the AMV-GCNs. In the AMV-GCNs, an adaptive view transformation module is designed to reduce the impact of view diversity: it automatically determines suitable viewpoints and transforms skeletons into new representations under those viewpoints for better recognition. Furthermore, we employ multiple GCN-based streams to learn action information from different viewpoints. Finally, the classification scores from the multiple streams are fused to produce the recognition result. Extensive experimental evaluations on four challenging datasets, NTU RGB+D 60, NTU RGB+D 120, Northwestern-UCLA, and UTD-MHAD, demonstrate the superiority of our proposed network.
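The view transformation described above can be sketched as a rotation of the 3D joint coordinates into a new viewpoint. The following is a minimal illustrative example, not the paper's actual module: in the AMV-GCNs the rotation angles would be learnable parameters optimized end to end with the GCN streams, whereas here they are plain function inputs.

```python
import numpy as np

def rotation_matrix(alpha, beta, gamma):
    """Compose rotations about the x, y, and z axes (radians)."""
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(alpha), -np.sin(alpha)],
                   [0, np.sin(alpha),  np.cos(alpha)]])
    Ry = np.array([[ np.cos(beta), 0, np.sin(beta)],
                   [0, 1, 0],
                   [-np.sin(beta), 0, np.cos(beta)]])
    Rz = np.array([[np.cos(gamma), -np.sin(gamma), 0],
                   [np.sin(gamma),  np.cos(gamma), 0],
                   [0, 0, 1]])
    return Rz @ Ry @ Rx

def view_transform(skeleton, angles):
    """Rotate a skeleton sequence of shape (frames, joints, 3)
    into a new viewpoint given three Euler angles.

    In an adaptive module, `angles` would be predicted from the
    input and trained jointly with the recognition network."""
    R = rotation_matrix(*angles)
    # Apply the same rotation to every joint in every frame.
    return skeleton @ R.T
```

Each GCN-based stream would receive the skeleton transformed under a different learned viewpoint, and the streams' classification scores are fused at the end.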
