Abstract
In this paper, we propose a deep-learning-based approach for real-time hand pose estimation from a single depth image, using a 3D convolutional neural network that takes as input a 3D voxelized grid generated from the depth image. Most previous work on hand pose estimation takes a single 2D depth image as input and estimates the coordinates of the hand's key points with a 2D convolutional neural network. The disadvantage of those methods is that a 2D depth image cannot represent the spatial information of 3D data, whereas a 3D voxelized grid can represent the point cloud of the hand's surface spatially. Hence, we design a 3D convolutional neural network that takes a 3D voxelized grid with data padding as input and stabilizes the estimated hand skeleton with an additional loss function for regression. Experiments show that our approach outperforms previous methods on two public datasets and runs in real time on a single GPU.
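To make the input representation concrete, the following is a minimal sketch of how a depth image might be turned into a padded voxel grid. It is an illustration under stated assumptions, not the paper's implementation: the function name `depth_to_voxel_grid`, the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the grid size of 32, the binary occupancy encoding, and the reading of "data padding" as a margin of empty voxels around the hand are all hypothetical choices made here.

```python
import numpy as np


def depth_to_voxel_grid(depth, fx, fy, cx, cy, grid_size=32, pad=2):
    """Back-project a depth image and bin the points into a binary occupancy grid.

    Assumptions (not from the paper): pinhole camera model, zero depth means
    "no measurement", and padding is an empty margin of `pad` voxels per side.
    """
    # Pixel rows (v) and columns (u) that carry a valid depth value.
    v, u = np.nonzero(depth > 0)
    z = depth[v, u].astype(np.float64)

    # Pinhole back-projection to camera-space 3D points.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=1)  # shape (N, 3)

    grid = np.zeros((grid_size, grid_size, grid_size), dtype=np.float32)
    if points.shape[0] == 0:
        return grid

    # Scale the point cloud uniformly into the inner cube so the aspect
    # ratio of the hand is preserved and the margin stays empty.
    inner = grid_size - 2 * pad
    mins = points.min(axis=0)
    extent = max((points.max(axis=0) - mins).max(), 1e-6)
    idx = np.floor((points - mins) / extent * (inner - 1)).astype(int) + pad

    # Mark every voxel that contains at least one surface point.
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid
```

The resulting `(grid_size, grid_size, grid_size)` array could then be passed, with batch and channel dimensions added, to a 3D convolutional network. A real pipeline would first segment the hand from the background, which this sketch omits.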