Abstract

We propose a novel technique for accurately reconstructing 3D human poses from video recorded with a mobile phone camera. A Mask R-CNN network is first applied to the recorded video frames to detect the human body and extract 2D body skeletons. A temporal convolutional network (TCN) then lifts the 2D skeletons to 3D to estimate the 3D human pose. Experimental evaluations show that the proposed technique accurately reconstructs 3D human poses from mobile phone camera recordings, with results very close to those obtained by a specialized motion capture system.
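The lifting step can be illustrated with a minimal NumPy sketch of a dilated temporal convolution stack that maps a sequence of 2D skeletons to 3D joint positions. This is not the authors' network: the weights are random and untrained, and the joint count (17, as in COCO-style keypoints), hidden width, kernel size, and dilations (1, 3, 9) are assumptions chosen only to show the tensor shapes and how the temporal receptive field grows. All function names are hypothetical.

```python
import numpy as np


def dilated_temporal_conv(x, weight, bias, dilation):
    """Valid (unpadded) 1D convolution along the time axis.

    x:      (T, C_in) per-frame feature vectors
    weight: (K, C_in, C_out) kernel with K temporal taps
    bias:   (C_out,)
    Returns an array of shape (T - (K - 1) * dilation, C_out).
    """
    k = weight.shape[0]
    t_out = x.shape[0] - (k - 1) * dilation
    if t_out <= 0:
        raise ValueError("sequence is shorter than the receptive field")
    out = np.tile(bias, (t_out, 1)).astype(x.dtype)
    for tap in range(k):
        start = tap * dilation
        # Each tap looks at frames offset by `tap * dilation`.
        out += x[start:start + t_out] @ weight[tap]
    return out


def make_random_layers(num_joints, hidden=64, kernel=3,
                       dilations=(1, 3, 9), seed=0):
    """Build untrained layers as (weight, bias, dilation) tuples (assumed sizes)."""
    rng = np.random.default_rng(seed)
    layers = []
    c_in = num_joints * 2  # (x, y) per joint
    for d in dilations:
        w = rng.normal(0.0, 0.1, size=(kernel, c_in, hidden))
        layers.append((w, np.zeros(hidden), d))
        c_in = hidden
    # Final single-tap layer projects to (x, y, z) per joint.
    w = rng.normal(0.0, 0.1, size=(1, c_in, num_joints * 3))
    layers.append((w, np.zeros(num_joints * 3), 1))
    return layers


def receptive_field(layers):
    """Number of input frames that influence one output frame."""
    return 1 + sum((w.shape[0] - 1) * d for w, _, d in layers)


def lift_2d_to_3d(keypoints_2d, layers):
    """Map 2D skeletons (T, J, 2) to 3D poses (T_out, J, 3)."""
    t, j, _ = keypoints_2d.shape
    h = keypoints_2d.reshape(t, j * 2)
    for i, (w, b, d) in enumerate(layers):
        h = dilated_temporal_conv(h, w, b, d)
        if i < len(layers) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers only
    return h.reshape(h.shape[0], j, 3)


if __name__ == "__main__":
    layers = make_random_layers(num_joints=17)
    skeletons_2d = np.random.default_rng(1).normal(size=(50, 17, 2))
    poses_3d = lift_2d_to_3d(skeletons_2d, layers)
    print(receptive_field(layers), poses_3d.shape)
```

Because the convolutions are unpadded, each 3D pose is predicted from a window of neighboring 2D frames, which is what lets a temporal model resolve depth ambiguities that a single frame cannot. In practice the 2D input would come from the keypoint detector and the weights from training against motion capture ground truth.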
