Abstract

In contemporary research on human action recognition, most methods consider the movement features of each joint separately. However, they ignore that a human action results from the coordinated movement of all joints. To address this problem, this paper proposes two action feature representations: the Motion Collaborative Spatio-Temporal Vector (MCSTV) and the Motion Spatio-Temporal Map (MSTM). MCSTV accounts for the integrity and cooperation of the moving joints: it accumulates the limbs' motion vectors with weights to form a new vector that describes the movement features of the action. To describe the action more comprehensively and accurately, we extract key motion energy using key information extraction based on inter-frame energy fluctuation, project the energy onto three orthogonal axes, and stitch the projections in temporal order to construct the MSTM. To combine the advantages of MSTM and MCSTV, we propose Multi-Target Subspace Learning (MTSL), which projects MSTM and MCSTV into a common subspace so that they complement each other. The results on MSR-Action3D and UTD-MHAD show that our method achieves higher recognition accuracy than most existing human action recognition algorithms.
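
The weighted accumulation behind MCSTV can be illustrated with a minimal sketch. The abstract does not state how the weights are chosen or how each limb's motion vector is measured, so the function name `mcstv`, the input layout (one 3-D motion vector per limb per frame), and the default magnitude-proportional weights are all assumptions made for illustration, not the paper's definition.

```python
import numpy as np

def mcstv(limb_vectors, weights=None):
    """Weighted accumulation of limb motion vectors (illustrative only).

    limb_vectors: array of shape (T, L, 3), the motion vector of each of
                  L limbs in each of T frames (assumed input layout).
    weights:      optional array broadcastable to (T, L). If omitted, each
                  limb is weighted by its share of the frame's total motion
                  magnitude (an assumed rule, not taken from the paper).
    Returns an array of shape (T, 3): one accumulated vector per frame.
    """
    v = np.asarray(limb_vectors, dtype=np.float64)
    if weights is None:
        mag = np.linalg.norm(v, axis=-1)              # (T, L) limb magnitudes
        total = mag.sum(axis=1, keepdims=True)        # (T, 1) per-frame total
        # Frames with no motion get zero weights instead of a division by zero.
        w = np.divide(mag, total, out=np.zeros_like(mag), where=total > 0)
    else:
        w = np.broadcast_to(np.asarray(weights, dtype=np.float64), v.shape[:2])
    return (w[..., None] * v).sum(axis=1)             # (T, 3)
```

With two limbs moving by (1, 0, 0) and (0, 3, 0) in one frame, the assumed default weights are 0.25 and 0.75, so the faster limb dominates the accumulated vector. Passing explicit `weights` replaces that rule with any other weighting scheme.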

Highlights

  • Human action recognition [1] is a research hotspot in the field of artificial intelligence and pattern recognition

  • Map; Multi-Target Subspace Learning; key information extraction based on inter-frame energy fluctuation

  • Motion Spatio-Temporal Map (MSTM) can accurately describe the spatial structure [14] and temporal information [15] of actions. We extract key motion energy by key information extraction based on inter-frame energy fluctuation, project the key energy of the body onto three orthogonal axes, and stitch the projections in temporal order to form three-axis MSTMs
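
The MSTM construction described above (extract key motion energy, project it onto three orthogonal axes, stitch over time) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: motion energy is taken to be the absolute difference between consecutive depth frames, a frame transition counts as "key" when its total energy is at least `ratio` times the mean, and the depth-axis projection is an energy-weighted histogram. The names `build_mstm`, `ratio` and `n_depth_bins` are hypothetical.

```python
import numpy as np

def motion_energy(depth):
    """Absolute inter-frame difference of a (T, H, W) depth sequence, T >= 2."""
    return np.abs(np.diff(np.asarray(depth, dtype=np.float64), axis=0))

def key_indices(energy, ratio=1.0):
    """Indices of frame transitions whose total energy is at least
    ratio * mean energy (an assumed selection rule)."""
    per_frame = energy.sum(axis=(1, 2))
    return np.flatnonzero(per_frame >= ratio * per_frame.mean())

def build_mstm(depth, n_depth_bins=8, ratio=1.0):
    """Return three maps (width, height, depth axis), one column per key frame."""
    depth = np.asarray(depth, dtype=np.float64)
    energy = motion_energy(depth)                 # (T-1, H, W)
    keep = key_indices(energy, ratio)
    lo, hi = depth.min(), depth.max()
    cols_x, cols_y, cols_z = [], [], []
    for t in keep:
        e = energy[t]
        cols_x.append(e.sum(axis=0))              # projection onto width axis, (W,)
        cols_y.append(e.sum(axis=1))              # projection onto height axis, (H,)
        # Depth axis: bin the later frame's depth values, weighted by energy.
        z_hist, _ = np.histogram(depth[t + 1], bins=n_depth_bins,
                                 range=(lo, hi), weights=e)
        cols_z.append(z_hist)
    # Stitch the per-frame projections side by side in temporal order.
    return (np.stack(cols_x, axis=1),
            np.stack(cols_y, axis=1),
            np.stack(cols_z, axis=1))
```

Each returned map has one column per retained frame transition, so the horizontal axis encodes time and the vertical axis encodes position along one spatial axis. Because every projection redistributes the same per-frame energy, the column sums of the three maps agree.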


Summary

Introduction

Human action recognition [1] is a research hotspot in the field of artificial intelligence and pattern recognition. The research achievements have been used in many aspects of life [2], such as human-computer interaction, biometrics, health monitoring, video surveillance systems, somatosensory games, and robotics. Owing to the development of lower-cost depth sensors, depth cameras have been widely used in action recognition. Compared with red, green and blue (RGB) cameras, the depth camera is not sensitive to lighting conditions [4]. It makes it easy to distinguish the background from the foreground, and provides human depth data. Human skeletal information can be obtained from the depth map.

Methods
Results
Conclusion
