Abstract

Representing 3-D motion-capture sensor data as 2-D color-coded joint distance maps (JDMs) fed to a deep neural network has been shown to be effective for 3-D skeleton-based human action recognition. However, joint distances are limited in their ability to represent rotational joint movements, which carry a considerable amount of the information used in human action classification. Moreover, to achieve subject, view, and time invariance in the recognition process, the deep classifier must be trained on JDMs computed along different coordinate axes and fed through multiple streams. To overcome these shortcomings of JDMs, we propose integrating joint angular movements with joint distances in a spatiotemporal color-coded image called a joint angular displacement map (JADM). In the literature, multistream deep convolutional neural networks (CNNs) have been employed to achieve invariance across subjects and views for 3-D human action data, but this invariance is obtained at the cost of longer training times. To improve recognition accuracy with reduced training time, we test our JADMs with a single-stream deep CNN model. To evaluate and analyze the proposed method, we chose 3-D motion-capture sequences of yoga, which form a complex set of actions with lateral and rotational spatiotemporal variations. We also validated the proposed method on conventional 3-D human action data from the publicly available HDM05 and CMU datasets. The proposed model can accurately recognize 3-D yoga actions, which may help in building a 3-D model-based yoga assistant tool.
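The abstract does not spell out the exact JADM encoding, but a minimal sketch of the general idea is shown below: per-frame joint angular displacements (relative to a reference joint) are computed from 3-D joint positions and color-coded into a 2-D pseudo-image, with rows indexing joints and columns indexing time. All function names, the choice of reference joint, and the "jet" color mapping here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import matplotlib.pyplot as plt


def joint_angular_displacement_map(skeleton, reference_joint=0, colormap="jet"):
    """Build a JADM-like color-coded image from a skeleton sequence.

    skeleton: array of shape (T, J, 3) holding 3-D joint positions over T frames.
    Returns an (J-1, T-1, 3) RGB image where pixel (j, t) encodes the angular
    displacement of joint j about the reference joint between frames t and t+1.
    """
    T, J, _ = skeleton.shape
    # Vectors from the reference joint to every other joint, per frame.
    vecs = skeleton - skeleton[:, reference_joint:reference_joint + 1, :]
    vecs = np.delete(vecs, reference_joint, axis=1)            # (T, J-1, 3)

    # Angle between the same joint vector in consecutive frames.
    a, b = vecs[:-1], vecs[1:]
    cos = np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-8
    )
    angles = np.arccos(np.clip(cos, -1.0, 1.0))                # (T-1, J-1), radians

    # Normalize to [0, 1] and color-code, giving a (J-1, T-1, 3) pseudo-image.
    norm = angles / (angles.max() + 1e-8)
    rgb = plt.get_cmap(colormap)(norm.T)[..., :3]
    return rgb


if __name__ == "__main__":
    # Example: 60 frames of a 25-joint skeleton with small random motion.
    frames = np.cumsum(np.random.randn(60, 25, 3) * 0.01, axis=0)
    jadm = joint_angular_displacement_map(frames)
    print(jadm.shape)  # (24, 59, 3), ready to resize and feed to a single-stream CNN
```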
