Latent Distribution-Based 3D Hand Pose Estimation From Monocular RGB Images

Moran Li, Nong Sang, Jialong Wang

doi:10.1109/tcsvt.2021.3055862

Abstract

In this article, we propose a novel compressed latent distribution representation for 3D hand pose estimation from monocular RGB images to alleviate the channel correspondence problem. This problem occurs when the 2D and depth coordinates are estimated from independent feature maps, so the 2D and depth channel sequences may not match during cross-dataset inference. In contrast, in our compressed latent distribution representation the 2D and depth feature maps for each joint are interconnected and inter-constrained more directly, which alleviates the channel correspondence problem and improves cross-dataset performance. Moreover, we design an efficient encoder-decoder network that maintains the resolution of feature maps to enable better hand feature extraction from monocular RGB images. The overall pipeline contains two branches: a 2D hand pose estimation branch based on a latent heatmap representation (LHR), and a 3D hand pose estimation branch based on our proposed latent distribution representation (LDR). The 2D estimation branch serves as guidance for the 3D branch, which simplifies the optimization of the overall network and results in more rapid convergence during training. Results on several benchmark datasets (STB, RHD, and the recently released InterHand2.6M) demonstrate that our proposed method achieves state-of-the-art (SOTA) performance.
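
The abstract does not say how the latent heatmap representation (LHR) is decoded into 2D joint coordinates. A common choice in latent-heatmap methods is a spatial softmax followed by an expectation over pixel coordinates (soft-argmax). The sketch below assumes that decoding and is not taken from the paper; the function name `soft_argmax_2d` and the `beta` temperature are hypothetical.

```python
import math


def soft_argmax_2d(latent, beta=1.0):
    """Expected (x, y) pixel location of one joint's latent heatmap.

    latent: list of rows (H x W) of unnormalized scores.
    beta:   hypothetical temperature; larger values sharpen the softmax.
    Assumption: soft-argmax decoding, which the abstract does not confirm.
    """
    width = len(latent[0])
    flat = [beta * v for row in latent for v in row]

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(flat)
    exps = [math.exp(v - peak) for v in flat]
    total = sum(exps)

    # Expectation of the column (x) and row (y) indices under the softmax.
    x = y = 0.0
    for idx, e in enumerate(exps):
        p = e / total
        x += p * (idx % width)
        y += p * (idx // width)
    return x, y


# A 3 x 4 map with one dominant score at column 2, row 1.
heatmap = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 50.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
joint_x, joint_y = soft_argmax_2d(heatmap)
```

Because the expectation is differentiable, a decoder of this kind can be trained end to end with a loss on the coordinates rather than on the heatmap itself.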

Journal: IEEE Transactions on Circuits and Systems for Video Technology
Publication Date: Feb 2, 2021
Citations: 11

Similar Papers

Real Time 3D Pose Estimation of Both Human Hands via RGB-Depth Camera and Deep Convolutional Neural Networks
Geon Gi ... Hye Min Park
06 Jun 2019

Cascaded Hierarchical CNN for RGB-Based 3D Hand Pose Estimation
Shiming Dai ... Lili Fan
Mathematical Problems in Engineering | VOL. 2020
15 Jul 2020

Hand pose estimation in depth image using CNN and random forest
Xi Chen ... Zhiwen Fang
08 Mar 2018

3D hand pose and shape estimation from RGB images for keypoint-based hand gesture recognition
Danilo Avola ... Daniele Pannone
Pattern Recognition | VOL. 129
30 Apr 2022
