3D hand pose and mesh estimation via a generic Topology-aware Transformer model.

Shaoqi Yu,Xiaolin Zhang,Lili Chen,Jiamao Li,Yintong Wang

doi:10.3389/fnbot.2024.1395652

Abstract

In Human-Robot Interaction (HRI), accurate 3D hand pose and mesh estimation is critically important. However, inferring reasonable and accurate poses under severe self-occlusion and high self-similarity remains an inherent challenge. To alleviate the ambiguity caused by invisible and similar joints during HRI, we propose a new Topology-aware Transformer network named HandGCNFormer that takes a depth image as input and incorporates prior knowledge of hand kinematic topology into the network while modeling long-range contextual information. Specifically, we propose a novel Graphformer decoder with an additional Node-offset Graph Convolutional layer (NoffGConv). The Graphformer decoder optimizes the synergy between the Transformer and the GCN, capturing both long-range dependencies and local topological connections between joints. On top of that, we replace the standard MLP prediction head with a novel Topology-aware head to better exploit local topological constraints for more reasonable and accurate poses. Our method achieves state-of-the-art 3D hand pose estimation performance on four challenging datasets: Hands2017, NYU, ICVL, and MSRA. To further demonstrate the effectiveness and scalability of the proposed Graphformer decoder and Topology-aware head, we extend our framework to HandGCNFormer-Mesh for the 3D hand mesh estimation task. The extended framework efficiently integrates a shape regressor with the original Graphformer decoder and Topology-aware head, producing MANO parameters. Results on the HO-3D dataset, which contains varied and challenging occlusions, show that HandGCNFormer-Mesh is competitive with previous state-of-the-art 3D hand mesh estimation methods.
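
To make the idea of "local topological connections between joints" concrete, the following is a minimal sketch of a generic graph-convolution propagation step over a hand skeleton. It is not the paper's NoffGConv layer or Graphformer decoder, whose details are not given in the abstract. The 21-joint layout, the joint ordering, and the helper names (`build_hand_edges`, `normalized_adjacency`, `graph_conv`) are all assumptions made for illustration, and the step has no learned weights.

```python
# Hypothetical 21-joint hand skeleton: joint 0 is the wrist, and each of the
# five fingers is a chain of four joints (1-4, 5-8, 9-12, 13-16, 17-20).
# This ordering is an assumption for illustration; real datasets differ.
NUM_JOINTS = 21


def build_hand_edges():
    """Return the (parent, child) bones of the assumed skeleton."""
    edges = []
    for finger in range(5):
        base = 1 + 4 * finger
        edges.append((0, base))  # wrist to the first joint of this finger
        for k in range(3):
            edges.append((base + k, base + k + 1))  # consecutive finger joints
    return edges


def normalized_adjacency(edges, n):
    """Row-normalized adjacency with self-loops: D^-1 (A + I)."""
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop keeps each joint's own feature
    for a, b in edges:
        adj[a][b] = 1.0
        adj[b][a] = 1.0  # bones are treated as undirected
    for row in adj:
        total = sum(row)
        for j in range(n):
            row[j] /= total
    return adj


def graph_conv(features, adj):
    """One weight-free propagation step: every joint averages its own
    feature vector with those of its skeleton neighbors."""
    n = len(features)
    dim = len(features[0])
    return [
        [sum(adj[i][j] * features[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]


edges = build_hand_edges()
adj = normalized_adjacency(edges, NUM_JOINTS)
# Toy per-joint features: a single channel holding the joint index.
features = [[float(i)] for i in range(NUM_JOINTS)]
smoothed = graph_conv(features, adj)
```

In this sketch information travels only along bones, so an occluded joint would borrow evidence from its visible neighbors. A Transformer's attention, by contrast, can relate any pair of joints directly, and the abstract describes combining the two kinds of context.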

Journal: Frontiers in Neurorobotics
Publication Date: May 3, 2024
License type: CC BY 4.0
