Abstract

Using video sequences to restore 3D human poses is of great significance in the field of motion capture. This paper proposes a novel approach to estimate 3D human action via end-to-end learning of deep convolutional neural network to calculate the parameters of the parameterized skinned multi-person linear model. The method is divided into two main stages: (1) 3D human pose estimation based on a single frame image. We use 2D/3D skeleton point constraints, human height constraints, and generative adversarial network constraints to obtain a more accurate human-body model. The model is pre-trained using open-source human pose datasets; (2) Human-body pose generation based on video streams. Combined with the correlation of video sequences, a 3D human pose recovery method based on video streams is proposed, which uses the correlation between videos to generate a smoother 3D pose. In addition, we compared the proposed 3D human pose recovery method with the commercial motion capture platform to prove the effectiveness of the proposed method. To make a contrast, we first built a motion capture platform through two Kinect (V2) devices and iPi Soft series software to obtain depth-camera video sequences and monocular-camera video sequences respectively. Then we defined several different tasks, including the speed of the movements, the position of the subject, the orientation of the subject, and the complexity of the movements. Experimental results show that our low-cost method based on RGB video data can achieve similar results to commercial motion capture platform with RGB-D video data.

Highlights

  • In the digital video industry, motion capture methods based on high-speed cameras and key-point markers on the body surface are widely used [1]

  • 3D human poses from unmarked RGB video sequences. This approach mainly includes three parts: end-to-end 3D human pose estimation based on single frame images, human-body pose generation based on video streams, and motion capture experiment verification based on 3D human pose recovery

  • The traditional method of manually defining rules is replaced by generative adversarial network (GAN) constraints, which is used to generate a more human-like 3D human model; Combining the correlation of video sequences, a 3D human pose recovery method based on video streams is proposed, which uses the correlation between videos for smoothing to generate a more stable 3D human pose; Using two Kinect devices and iPi Soft series software to build a platform for motion capture experiments, an approach of using RGB-D video sequences is compared with the proposed approach of using RGB video sequences to verify the effectiveness of the proposed solution

Read more

Summary

Introduction

In the digital video industry, motion capture methods based on high-speed cameras and key-point markers on the body surface are widely used [1]. This approach has the following disadvantages: (1) The markers on the human-body surface limits the freedom of human-body movement; (2) A controlled and restricted site is needed to obtain high-accuracy data; (3) Calculation is very time consuming, etc. The unmarked motion capture method based on ordinary optical cameras is convenient and inexpensive.

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call