Abstract

Human pose estimation is a core component of applications that require some level of human–computer interaction, such as assistive robotics, ambient assisted living, and the motion capture systems used in biomechanics and video game production. In this paper, we propose an end-to-end pipeline for estimating 3D human poses that runs in real time on an off-the-shelf computer, taking as input video sequences captured with a commercial RGBD sensor. Our hybrid approach consists of two stages: 2D pose estimation using deep neural networks, and 3D registration, for which we developed a lightweight algorithm based on classic computer vision techniques. We compare several 2D pose estimators and validate the performance of the proposed method against the state of the art, using a publicly available international dataset as the benchmark. Our 2D-to-3D registration module alone can reach frame rates of up to 99 fps while achieving an average per-joint error of 132 mm. Furthermore, the proposed solution is agnostic to the model used for 2D pose estimation, so it can be upgraded as new estimators become available or adapted to other articulated objects.
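
The abstract does not specify the registration algorithm, so the following is only a minimal sketch of one common way to lift 2D joints to 3D with an RGBD sensor: standard pinhole back-projection using the depth value at each joint's pixel. It is not the authors' method. The function names, the intrinsics `fx`, `fy`, `cx`, `cy`, and the convention that a depth of 0 means "no reading" are all assumptions made for illustration. A mean per-joint error helper is included to show how a figure such as the reported 132 mm could be computed.

```python
from math import sqrt

def backproject(u, v, depth_mm, fx, fy, cx, cy):
    """Lift pixel (u, v) with depth in mm to camera-frame 3D coordinates in mm.

    Standard pinhole model; fx, fy, cx, cy are hypothetical camera intrinsics.
    """
    x = (u - cx) * depth_mm / fx
    y = (v - cy) * depth_mm / fy
    return (x, y, depth_mm)

def lift_pose(keypoints_2d, depth_map, fx, fy, cx, cy):
    """Lift each 2D joint using the depth at its nearest pixel.

    depth_map is indexed as depth_map[row][col]. Joints whose depth is 0
    (assumed here to mean a missing reading) are returned as None.
    """
    joints = []
    for u, v in keypoints_2d:
        d = depth_map[int(round(v))][int(round(u))]
        joints.append(backproject(u, v, d, fx, fy, cx, cy) if d > 0 else None)
    return joints

def mean_per_joint_error(pred, truth):
    """Average Euclidean distance over joints that have a valid prediction."""
    dists = [
        sqrt(sum((p - t) ** 2 for p, t in zip(pj, tj)))
        for pj, tj in zip(pred, truth)
        if pj is not None
    ]
    return sum(dists) / len(dists) if dists else float("nan")
```

Because the lifting step only reads 2D keypoints and a depth map, any 2D pose estimator can feed it, which is consistent with the model-agnostic design described above.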
