Abstract

Recent advances in smart sensor technology and computer vision techniques have made it possible to track unmarked human hand and finger movements with high accuracy and at sampling rates of over 120 Hz. However, these new sensors also present challenges for real-time gesture recognition because fingers are frequently occluded by other parts of the hand. We present a novel multisensor technique that improves pose estimation accuracy during real-time computer vision gesture recognition. A classifier is trained offline, using a premeasured artificial hand, to learn which hand positions and orientations are likely to be associated with higher pose estimation error. At run time, our algorithm uses the prebuilt classifier to select the best sensor-generated skeletal pose at each time step, yielding a fused sequence of optimal poses over time. The artificial hand used to establish the ground truth is configured in a number of commonly used hand poses, such as pinches and taps. Experimental results demonstrate that this technique reduces total pose estimation error by over 30% compared with using a single sensor, while maintaining real-time performance. Our evaluations also show that our approach significantly outperforms alternative approaches such as weighted averaging of hand poses. An analysis of classifier performance shows that the offline training time is negligible, and our configuration achieves about 90.8% optimality on the dataset used. By analyzing skeletal poses from multiple views, our method effectively increases the robustness of touchless display interactions, especially in high-occlusion situations.
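
As a rough illustration of the run-time selection step described above, the sketch below picks, at each frame, the sensor pose that a pretrained classifier scores as least likely to carry high estimation error. This is not the authors' implementation: the feature set, the pose representation, and names such as `extract_features` and `select_best_pose` are assumptions, and a scikit-learn random forest stands in for whatever classifier the paper actually trains.

```python
# Minimal sketch of classifier-guided multisensor pose selection
# (hypothetical API; a stand-in for the paper's actual method).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def extract_features(pose):
    """Assumed features: palm position and orientation of a skeletal pose.
    `pose` is taken to be a dict with 'position' (3,) and 'orientation' (3,)."""
    return np.concatenate([pose["position"], pose["orientation"]])

def select_best_pose(poses, clf):
    """Return the pose whose features the classifier scores as least
    likely to have high estimation error (class 1 = high error)."""
    feats = np.stack([extract_features(p) for p in poses])
    p_high_error = clf.predict_proba(feats)[:, 1]
    return poses[int(np.argmin(p_high_error))]

# Offline phase: train on poses labeled against a premeasured ground truth.
# Synthetic stand-in data is used here purely so the sketch runs.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 6))            # position + orientation features
y_train = (rng.random(500) > 0.5).astype(int)  # 1 = high pose error
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Run-time phase: at each frame, fuse by choosing the best sensor's pose.
frame = [
    {"position": rng.normal(size=3), "orientation": rng.normal(size=3)}
    for _ in range(2)  # e.g., two sensors viewing the hand from different angles
]
best_pose = select_best_pose(frame, clf)
```

Applying `select_best_pose` frame by frame produces the fused sequence of per-step optimal poses that the abstract describes.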
