Abstract

We present MVPose, a novel system designed to enable real-time multi-person pose estimation (PE) on commodity mobile devices, which consists of three novel techniques. First, MVPose takes a motion-vector-based approach to quickly and accurately track human keypoints across consecutive frames, rather than running expensive human-detection and pose-estimation models on every frame. Second, MVPose designs a mobile-friendly PE model that uses lightweight feature extractors and a multi-stage network to significantly reduce the latency of pose estimation without compromising model accuracy. Third, MVPose leverages the heterogeneous computing resources of both the CPU and the GPU to execute the pose-estimation model for multiple persons in parallel, which further reduces the total latency. We present extensive experiments that evaluate the effectiveness of the proposed techniques by implementing MVPose on five off-the-shelf commercial smartphones. Evaluation results show that MVPose achieves over 30 frames per second PE with 4 persons per frame, which significantly outperforms the state-of-the-art baseline, with a speedup of up to 5.7× and 3.8× in latency on CPU and GPU, respectively. Compared with the baseline, MVPose achieves an improvement of 10.1% in multi-person PE accuracy. Furthermore, MVPose achieves up to 74.3% and 57.6% energy-per-frame savings on average.
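
To make the first technique concrete, the following is a minimal sketch of how keypoints might be carried from one frame to the next using block-level motion vectors instead of re-running a pose model. It is an illustration under stated assumptions, not the MVPose implementation: the function name `propagate_keypoints`, the 16-pixel block size, and the representation of the motion field as a dictionary of per-block forward displacements (previous frame to current frame) are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]                              # (x, y) in pixels
MotionField = Dict[Tuple[int, int], Tuple[float, float]]  # (block_row, block_col) -> (dx, dy)

BLOCK_SIZE = 16  # assumed block size in pixels; real codecs use several sizes


def propagate_keypoints(
    keypoints: List[Point],
    motion_field: MotionField,
    block_size: int = BLOCK_SIZE,
) -> List[Point]:
    """Shift each keypoint by the motion vector of the block that contains it.

    Assumption: each vector is a forward displacement (previous frame ->
    current frame). Blocks with no vector are treated as static.
    """
    moved = []
    for x, y in keypoints:
        # Find the block that contains the keypoint.
        block = (int(y // block_size), int(x // block_size))
        dx, dy = motion_field.get(block, (0.0, 0.0))
        moved.append((x + dx, y + dy))
    return moved


# Hypothetical data: two keypoints, one moving block.
previous = [(20.0, 5.0), (40.0, 40.0)]
field = {(0, 1): (3.0, -1.0)}
current = propagate_keypoints(previous, field)
```

In this sketch the per-frame cost is a lookup and an addition per keypoint, which is the intuition behind tracking with motion vectors rather than running detection and pose estimation on every frame. A practical system would also need to decide when tracked keypoints have drifted far enough to warrant running the full models again.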
