Abstract
3D hand pose estimation from a monocular RGB image is a highly challenging task due to self-occlusion, diverse appearances, and the inherent depth ambiguity of monocular images. Most previous methods first employ deep neural networks to fit 2D joint location maps, then combine these with implicit or explicit pose-aware features to directly regress 3D hand joint positions through a purpose-built network. However, the skeleton positions and the corresponding skeleton-aware content information located in the latent space are invariably ignored. These skeleton-aware contents effectively bridge the gap between hand joint and hand skeleton information by associating the features of different hand joints with the distribution of hand skeleton positions in 2D space. To address this issue, we propose a simple yet efficient deep neural network that directly recovers a reliable 3D hand pose from a monocular RGB image with a faster estimation process. Our goal is to reduce the model's computational complexity while maintaining high precision. To this end, we design a novel Feature Chat Block (FCB) for feature boosting, which enables intuitive, enhanced interaction between joint and skeleton features. First, the FCB module updates joint features using a semantic graph convolutional network and a multi-head self-attention mechanism: the GCN-based structure focuses on the physical hand joints encoded in a binary adjacency matrix, while the self-attention part attends to the hand joints covered by a complementary matrix. The FCB module then employs a query-key mechanism, with queries and keys representing joint and skeleton features respectively, to further implement feature interaction. After a stack of FCB modules, our model updates the fused features in a coarse-to-fine manner and finally outputs the predicted 3D hand pose. We conducted a comprehensive set of ablation experiments on the InterHand2.6M dataset to validate the effectiveness and significance of the proposed method. Additionally, experimental results on the Rendered Hand Dataset, the Stereo Hand Dataset, the First-Person Hand Action Dataset, and the FreiHAND dataset show that our model surpasses state-of-the-art methods with faster inference speed.
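The abstract only names the FCB's components; as a rough illustration, the following is a minimal PyTorch sketch of how such a block could be wired. The class and parameter names (`SemGCN`, `FeatureChatBlock`), the feature dimension, the adjacency normalization, and the use of an attention mask to realize the complementary matrix are all our assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class SemGCN(nn.Module):
    """Semantic graph convolution over the physical hand skeleton: each joint
    aggregates features from its neighbors in a binary adjacency matrix."""
    def __init__(self, dim, adjacency):
        super().__init__()
        # Row-normalized adjacency with self-loops (a common GCN choice).
        A = adjacency + torch.eye(adjacency.size(0))
        self.register_buffer("A_hat", A / A.sum(dim=1, keepdim=True))
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):  # x: (batch, joints, channels)
        return torch.relu(self.A_hat @ self.proj(x))

class FeatureChatBlock(nn.Module):
    """Sketch of one FCB: GCN over physical edges, self-attention over the
    complementary (non-adjacent) joint pairs, then joint-skeleton 'chat' via
    query-key cross-attention."""
    def __init__(self, dim, adjacency, num_heads=4):
        super().__init__()
        self.gcn = SemGCN(dim, adjacency)
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # attn_mask: True marks pairs that may NOT attend, so masking the
        # physical edges leaves only the complementary pairs visible.
        self.register_buffer("attn_mask", adjacency.bool())
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, joint_feat, skel_feat):
        # 1) Update joints from physical neighbors (GCN) and from
        #    non-adjacent joints (masked self-attention).
        x = self.gcn(joint_feat)
        attn_out, _ = self.self_attn(x, x, x, attn_mask=self.attn_mask)
        x = x + attn_out
        # 2) Joint-skeleton interaction: joints as queries,
        #    skeleton (bone) features as keys and values.
        chat_out, _ = self.cross_attn(x, skel_feat, skel_feat)
        return x + chat_out

# Example usage with a 21-joint hand and random features (shapes only).
J, C = 21, 64
A = torch.zeros(J, J)    # binary kinematic adjacency; fill with the hand's bone edges
A[0, 1] = A[1, 0] = 1.0  # e.g. one wrist-to-finger-base edge (illustrative)
fcb = FeatureChatBlock(C, A)
joints = torch.randn(2, J, C)      # per-joint features (batch, joints, channels)
bones = torch.randn(2, J - 1, C)   # per-bone skeleton features (21 joints -> 20 bones)
out = fcb(joints, bones)           # refined joint features, shape (2, 21, 64)
```

In this reading, stacking several such blocks and regressing 3D coordinates from the final joint features would give the coarse-to-fine refinement the abstract describes; the paper's exact masking, normalization, and fusion choices may differ.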