Abstract

Recognizing hand-object interactions is a significant challenge in computer vision because such interactions vary widely in form. Moreover, estimating the 3D position of a hand from a single frame is difficult, especially when the hand occludes the object from the observer's viewpoint. In this article, we present a novel approach to recognizing objects and facilitating virtual interactions, using a steering wheel as an illustrative example. We propose a real-time solution for identifying hand-object interactions in eXtended reality (XR) environments. Our approach relies on data captured by a single RGB camera during a manipulation scenario involving a steering wheel. Our model pipeline consists of three key components: (a) a hand landmark detector based on the MediaPipe cross-platform hand tracking solution; (b) a three-spoke steering wheel model tracker implemented using the faster region-based convolutional neural network (Faster R-CNN) architecture; and (c) a gesture recognition module designed to analyze interactions between the hand and the steering wheel. This approach not only offers a realistic experience of interacting with steering-based mechanisms but also contributes to reducing emissions in the real-world environment. Our experimental results demonstrate natural interaction with physical objects in virtual environments and showcase the precision and stability of our system.
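The following is a minimal sketch of how such a three-stage pipeline could be wired together, assuming a webcam feed, the MediaPipe Hands solution, and a torchvision Faster R-CNN fine-tuned for steering-wheel detection; the checkpoint name `wheel_frcnn.pth`, the `hand_on_wheel` heuristic, and the fingertip-majority grasp test are illustrative assumptions, not the authors' implementation.

```python
import cv2
import torch
import torchvision
import mediapipe as mp

# (a) Hand landmark detector: MediaPipe cross-platform hand tracking.
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=False, max_num_hands=2,
                       min_detection_confidence=0.5)

# (b) Steering-wheel tracker: Faster R-CNN (weights assumed to be fine-tuned on
#     a steering-wheel dataset; "wheel_frcnn.pth" is a placeholder name).
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=2)
detector.load_state_dict(torch.load("wheel_frcnn.pth", map_location="cpu"))
detector.eval()

def detect_wheel(frame_rgb):
    """Return the highest-scoring wheel box as [x1, y1, x2, y2], or None."""
    tensor = torch.from_numpy(frame_rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = detector([tensor])[0]
    if len(out["boxes"]) == 0:
        return None
    return out["boxes"][out["scores"].argmax()].tolist()

def hand_on_wheel(hand_landmarks, box, width, height):
    """(c) Toy interaction test: do most fingertips fall inside the wheel box?"""
    tips = [4, 8, 12, 16, 20]  # MediaPipe fingertip landmark indices
    x1, y1, x2, y2 = box
    inside = sum(x1 <= lm.x * width <= x2 and y1 <= lm.y * height <= y2
                 for lm in (hand_landmarks.landmark[i] for i in tips))
    return inside >= 3  # majority of fingertips over the wheel counts as a grasp

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    box = detect_wheel(rgb)
    result = hands.process(rgb)
    if box and result.multi_hand_landmarks:
        h, w = frame.shape[:2]
        for lm in result.multi_hand_landmarks:
            if hand_on_wheel(lm, box, w, h):
                print("hand-on-wheel interaction detected")
cap.release()
```

In this sketch the per-frame steering-wheel detection and the hand landmark estimation run independently on the same RGB frame, and the interaction module is reduced to a simple spatial overlap test; a full system would replace that test with the paper's gesture recognition module.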
