Abstract
Vision-based fingertip detection is useful for freehand Human-Computer Interaction (HCI), particularly for seamless experiences in virtual, augmented, and mixed reality. Estimating fingertip positions in an RGB image involves overcoming challenges such as occlusion and appearance ambiguities. The general approach relies on a two-stage pipeline: localizing the hand and then detecting the fingertips of that single hand. This paper presents an effective single-stage Convolutional Neural Network (CNN) for detecting the fingertips of both hands. We use a set of reference points, referred to as pose particles, and train a CNN model end-to-end to find the N-nearest particles in the proximity of each fingertip. The same CNN model also computes the components of the position vectors with respect to these N-nearest neighbors. Finally, each fingertip position is estimated as the centroid of the points given by these position vectors. The proposed approach can estimate fingertip positions for one or two hands and requires no prior hand localization. We demonstrate its feasibility and effectiveness through experiments on three different datasets.
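The final estimation step described above, averaging the points implied by the N position vectors, can be sketched as follows. This is a minimal illustration under the assumption that each position vector is a 2D offset from a pose particle to the fingertip; function and variable names are illustrative, not from the paper:

```python
import numpy as np

def estimate_fingertip(particles, offsets):
    """Estimate a fingertip position as the centroid of the points obtained
    by adding each predicted offset (position vector) to its corresponding
    nearest reference point (pose particle).

    particles: (N, 2) array of the N-nearest pose particle coordinates
    offsets:   (N, 2) array of predicted position-vector components,
               each pointing from its particle toward the fingertip
    """
    candidate_points = particles + offsets   # one fingertip candidate per particle
    return candidate_points.mean(axis=0)     # centroid of all candidates

# Toy example: three particles whose predicted offsets all point to (5, 5)
particles = np.array([[4.0, 4.0], [6.0, 4.0], [5.0, 7.0]])
offsets = np.array([[1.0, 1.0], [-1.0, 1.0], [0.0, -2.0]])
print(estimate_fingertip(particles, offsets))  # -> [5. 5.]
```

Averaging over N candidates rather than regressing a single point makes the estimate more robust to noise in any individual predicted offset.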
More From: IEEE Transactions on Circuits and Systems for Video Technology