Consistent-Resolution Network for 3D Hand Shape Estimation from a Single RGB Image

Qi Wu, Xianjun Yang, Joya Chen, Xu Zhou, Jianguo Wang, Zhiming Yao, Shaonan Wang

doi:10.1088/1742-6596/1631/1/012014

Abstract

We propose a novel method for 3D hand shape estimation from a single RGB image. Most existing methods use a deep network to extract a low-resolution representation from which 3D coordinates are estimated, which leads to a loss of spatial information. In contrast, we present a Consistent-Resolution Network (CRNet) that extracts a representation at the same resolution as the original image, thus preserving more spatial detail. Specifically, we adopt the recent high-resolution network (HRNet) to generate high-resolution feature maps of the original image. We then design a deconvolution module to recover these maps to the size of the original image. We can therefore use this feature directly to learn the precise 2D shape and the depth map, and convert them into 3D coordinates in camera space. Through extensive experiments on FreiHAND, a large real-world dataset, we show that our proposed method can predict precise and suitable 3D hand shape from a monocular view.
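The last step, turning a 2D shape and a depth map into camera-space 3D coordinates, can be illustrated with a standard pinhole back-projection. The abstract does not state which camera model or intrinsics are used, so the sketch below is an assumption for illustration only: the function name `backproject` and the parameters `fx`, `fy`, `cx`, `cy` are hypothetical and are not taken from the paper.

```python
def backproject(points_2d, depths, fx, fy, cx, cy):
    """Lift pixel coordinates (u, v) with depth z to camera-space (X, Y, Z).

    Assumes an ideal pinhole camera without lens distortion. The intrinsics
    fx, fy (focal lengths in pixels) and cx, cy (principal point) are
    hypothetical inputs, not values from the paper.
    """
    if len(points_2d) != len(depths):
        raise ValueError("points_2d and depths must have the same length")
    points_3d = []
    for (u, v), z in zip(points_2d, depths):
        # Invert the pinhole projection u = fx * X / Z + cx, v = fy * Y / Z + cy.
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points_3d.append((x, y, z))
    return points_3d


# Two example vertices: one at the principal point, one offset from it.
vertices = backproject(
    [(112.0, 112.0), (212.0, 12.0)],
    [0.5, 1.0],
    fx=100.0, fy=100.0, cx=112.0, cy=112.0,
)
```

Under this assumption, a pixel at the principal point maps onto the optical axis, and the lateral offset of any other pixel scales linearly with its predicted depth, which is why an accurate depth map matters as much as an accurate 2D shape.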
