Three-dimensional human pose estimation based on spatio-temporal multi-feature fusion network

Jun Ye, Yun Zhang, Fengping Wang, Srikanta Patnaik, Tao Shen

doi:10.1117/12.2644574

Abstract

Common 3D human pose estimation algorithms achieve good results in representation learning, but they still suffer from problems such as poor estimation accuracy at the joints of the human skeleton. How to make effective use of the redundant spatio-temporal information in 2D pose sequences obtained from monocular RGB images therefore remains a research challenge. This paper proposes a 3D human pose estimation method based on a spatio-temporal multi-feature fusion network. The algorithm hierarchically fuses image appearance information with motion timing information, and uses a compact convolutional neural network to learn spatio-temporal information that maps 2D joint positions to 3D joint positions. Experimental results show that the proposed method achieves improved end-to-end pose estimation accuracy without requiring any pose-optimization post-processing stage, and that it effectively improves average accuracy.
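To make the idea of mapping a 2D joint sequence to 3D joint positions concrete, the following is a minimal sketch of a temporal convolution over 2D poses. It is not the network proposed in the paper: the function name `lift_2d_to_3d`, the weight shapes, the single hidden layer, and the use of NumPy are all assumptions chosen for illustration, and the appearance-feature fusion branch is omitted entirely.

```python
import numpy as np


def lift_2d_to_3d(poses_2d, w_temporal, w_out):
    """Map a sequence of 2D poses to 3D poses with one temporal convolution.

    Hypothetical illustration only (not the authors' architecture).
    poses_2d:   (T, J, 2) array, T frames of J joints in image coordinates.
    w_temporal: (K, J*2, C) kernel spanning K consecutive frames.
    w_out:      (C, J*3) linear layer producing 3D coordinates.
    Returns an array of shape (T - K + 1, J, 3).
    """
    n_frames, n_joints, _ = poses_2d.shape
    kernel_size, _, n_channels = w_temporal.shape

    # Flatten each frame's joints into one feature vector: (T, J*2).
    x = poses_2d.reshape(n_frames, n_joints * 2)

    # "Valid" temporal convolution: each output frame sees K input frames.
    n_out = n_frames - kernel_size + 1
    hidden = np.zeros((n_out, n_channels))
    for t in range(n_out):
        window = x[t:t + kernel_size]  # (K, J*2)
        hidden[t] = np.einsum("kf,kfc->c", window, w_temporal)

    hidden = np.maximum(hidden, 0.0)  # ReLU non-linearity
    poses_3d = hidden @ w_out         # (n_out, J*3)
    return poses_3d.reshape(n_out, n_joints, 3)


# Usage with random (untrained) weights, assuming 17 joints and 9 frames.
rng = np.random.default_rng(0)
poses_2d = rng.normal(size=(9, 17, 2))
w_temporal = rng.normal(size=(3, 17 * 2, 16))
w_out = rng.normal(size=(16, 17 * 3))
poses_3d = lift_2d_to_3d(poses_2d, w_temporal, w_out)
```

With a kernel of 3 frames, each predicted 3D pose depends on a short window of neighbouring 2D poses, which is one simple way to use the temporal redundancy the abstract refers to. In a trained model the weights would be learned from paired 2D/3D data rather than sampled at random.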

Full Text