Abstract

Human pose estimation can be applied to many computer vision tasks, such as human–computer interaction, motion recognition, and action detection. However, few previous methods have focused on pose estimation in crowded scenes. Connection-based bottom-up approaches are the main pipeline for multi-person pose estimation, and they consist of three main processes: keypoint detection, connection detection, and pose assembly. The prediction accuracy of all three processes degrades significantly when they are applied to crowded scenes. In this paper, we use an improved method called Keypoint Likelihood Variance Reduction (KLVR) to decode the keypoint representation and improve keypoint detection accuracy in crowded scenes. Moreover, we apply a noise filter after keypoint detection to constrain the noise peaks that negatively affect pose assembly. In addition, to address the problem of isolated human parts caused by occlusion in crowded scenes, we use a Cycle Skeleton Structure (CSS) in our pose assembly process. In experiments, our method outperforms previous methods on the CrowdPose test dataset.
