Abstract

3D human pose estimation is a challenging problem due to the diversity of poses, human appearance, clothing, occlusion, and other factors. In this work, we split the problem into two stages, 2D human pose estimation and 3D pose recovery, and address it with a dilated-convolution network for videos. We introduce a pruning layer to prevent overfitting; it performs better than dropout because, rather than dropping nodes at random, it removes the lower-weight ones. We also employ quantization to accelerate the model in the smart-shop environment, achieving a trade-off between performance and computational complexity with an acceptable loss of accuracy. In addition, because labeled human pose datasets are limited and expensive, especially for 3D poses, we propose a 3D human pose dataset named HRI-I, collected in a smart-shop environment; it contains more than 16k poses, 26 people, and 6 scenarios including walking, hunkering, and fetching objects. We train and test our model on HumanEva-I, Human3.6M, and the proposed HRI-I; the results demonstrate that the proposed method is efficient and effective.
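The abstract describes two compression techniques: pruning that removes the lowest-weight connections (rather than random ones, as dropout does) and quantization to reduce computational cost. The paper's exact pruning layer and quantization scheme are not given here, so the following is only a minimal NumPy sketch of the general ideas (magnitude-based weight pruning and affine uint8 quantization); all function names and details are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, prune_ratio: float) -> np.ndarray:
    """Zero out the prune_ratio fraction of weights with smallest magnitude.

    Unlike dropout, which removes units at random during training, this
    deterministically keeps the highest-magnitude weights. (Illustrative
    sketch; the paper's pruning layer may differ in detail.)
    """
    flat = np.abs(weights).ravel()
    k = int(prune_ratio * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

def quantize_uint8(x: np.ndarray):
    """Affine quantization of float weights to uint8.

    Maps [min, max] linearly onto [0, 255]; storing uint8 instead of
    float32 cuts memory 4x and enables faster integer arithmetic, at
    the cost of a small, bounded rounding error.
    """
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize_uint8(q: np.ndarray, scale: float, lo: float) -> np.ndarray:
    """Recover approximate float weights from uint8 codes."""
    return q.astype(np.float32) * scale + lo
```

The trade-off the abstract mentions is visible here: after round-tripping through `quantize_uint8`, each weight is off by at most half a quantization step (`scale / 2`), which is the "acceptable range" of accuracy loss in exchange for reduced storage and compute.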
