Abstract

To address the problems of the traditional human motion gesture tracking and recognition methods, such as poor tracking effect, low recognition accuracy, high frame loss rate, and long-time cost, a dynamic human motion gesture tracking and recognition algorithm using multimode deep learning was proposed. Firstly, the collected human motion images are repaired in the three-dimensional (3D) environment, and the multimodal 3D human motion model is reconstructed using the processed images. Secondly, according to the results of model reconstruction, the camera gesture and other parameters of the keyframe are used to construct the target tracking optimization function so as to achieve the purpose of accurate tracking of human motion. Finally, for multimodal human motion gesture learning, a convolutional neural network (CNN) is developed. The trained CNN is utilized to complete dynamic human motion recognition after convolutional and pooling calculations. The results demonstrate that the proposed algorithm is effective in tracking human motion gestures. The average recognition accuracy is 96%, the average frame loss rate is 8.8%, the time cost is low, and the proposed algorithm has a high F-measure and much lower power consumption than other algorithms, indicating that the proposed algorithm is effective.

Highlights

  • To address the problems of the traditional human motion gesture tracking and recognition methods, such as poor tracking effect, low recognition accuracy, high frame loss rate, and long-time cost, a dynamic human motion gesture tracking and recognition algorithm using multimode deep learning was proposed

  • For multimodal human motion gesture learning, a convolutional neural network (CNN) is developed. e trained CNN is utilized to complete dynamic human motion recognition after convolutional and pooling calculations. e results demonstrate that the proposed algorithm is effective in tracking human motion gestures. e average recognition accuracy is 96%, the average frame loss rate is 8.8%, the time cost is low, and the proposed algorithm has a high F-measure and much lower power consumption than other algorithms, indicating that the proposed algorithm is effective

  • Introduction e development of machine vision is inseparable from the wide application of Artificial Intelligence (AI) technology, and at present, the tracking and recognition of motion targets is a hotspot in the research on target tracking [1, 2]

Read more

Summary

Introduction

To address the problems of the traditional human motion gesture tracking and recognition methods, such as poor tracking effect, low recognition accuracy, high frame loss rate, and long-time cost, a dynamic human motion gesture tracking and recognition algorithm using multimode deep learning was proposed. Literature [10] is based on a deep learning approach for multimodal complex activity recognition, which is an end-to-end model, which designs specific subnetwork architectures for different sensor data, merges the outputs of all subnetworks to extract fused features, and uses neural networks to learn human motion sequence information, which is overall more complex and time-consuming. (3) In the process of motion recognition, in order to ensure the effect of deep learning and reduce the corresponding training cost, CNN is introduced after convolutional computation and pool computation to reduce the computational complexity of multimodal training samples and improve the algorithm performance Based on multimodal deep learning, the main contributions of this paper are as follows: (1) e dynamic 3D image is restored by reconstructing the multimodal human 3D motion model, which provides a reference for target tracking and recognition and lays a solid foundation for subsequent tracking and recognition. (2) e target tracking optimization function is constructed by using the camera gesture and other parameters of the keyframe to improve the tracking accuracy of human motion gesture. (3) In the process of motion recognition, in order to ensure the effect of deep learning and reduce the corresponding training cost, CNN is introduced after convolutional computation and pool computation to reduce the computational complexity of multimodal training samples and improve the algorithm performance

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call