Abstract

Estimating 3D interacting hand poses and shapes from a single RGB image is challenging because the left and right hands are difficult to distinguish when they interact. This paper proposes GroupPoseNet, a network that addresses this problem with a grouping strategy: it extracts left- and right-hand features separately and thus avoids mutual interference between the interacting hands. Empowered by a novel up-sampling block, MF-Block, which predicts 2D heat-maps progressively by fusing image features, hand pose features, and multi-scale features, GroupPoseNet is effective and robust to severe occlusions. To achieve effective 3D hand reconstruction, we design a transformer-based inverse kinematics module (termed TikNet) that maps 3D joint locations to the shape and pose parameters of the MANO hand model. Comprehensive experiments on the InterHand2.6M dataset show that GroupPoseNet outperforms existing methods by a significant margin. Additional experiments demonstrate that it generalizes well to left-hand, right-hand, and interacting-hand pose estimation from a single RGB image. We also show the efficiency of TikNet through quantitative and qualitative results.
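The abstract describes TikNet as a transformer-based inverse kinematics module that maps 3D joint locations to MANO pose and shape parameters. A minimal NumPy sketch of that interface is shown below: joint coordinates are embedded as tokens, passed through a single self-attention layer, pooled per hand, and regressed to parameter vectors. All dimensions, weight names, and the single-head design are illustrative assumptions for exposition, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    # Single-head scaled dot-product attention over joint tokens.
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))
    return scores @ V

# 21 joints per hand, two hands; random stand-in for predicted 3D locations.
joints_3d = rng.normal(size=(42, 3))

d = 16                                  # token embedding width (assumed)
W_embed = rng.normal(size=(3, d))       # embeds (x, y, z) into a token
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
W_pose = rng.normal(size=(d, 48))       # MANO pose parameters per hand
W_shape = rng.normal(size=(d, 10))      # MANO shape parameters per hand

tokens = joints_3d @ W_embed
attended = self_attention(tokens, Wq, Wk, Wv)

# Pool each hand's tokens and regress its MANO parameter vectors.
left, right = attended[:21].mean(axis=0), attended[21:].mean(axis=0)
pose_l, shape_l = left @ W_pose, left @ W_shape
pose_r, shape_r = right @ W_pose, right @ W_shape
print(pose_l.shape, shape_l.shape)  # (48,) (10,)
```

In the actual model these weights would be learned, and the MANO parameters would drive a differentiable hand mesh; the sketch only illustrates the joints-to-parameters data flow.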
