Abstract
Hand gesture interpretation is an open research problem in Human-Computer Interaction (HCI). It involves locating gesture boundaries in a continuous video sequence (gesture spotting) and recognizing the gesture itself. Existing techniques model each gesture as a temporal sequence of visual features extracted from individual frames, which is inefficient due to the large variability of frames across timestamps. In this paper, we propose a new sub-gesture modeling approach that represents each gesture as a sequence of fixed sub-gestures (groups of consecutive frames with locally coherent context) and provides robust modeling of the visual features. We further extend this approach to the task of gesture spotting, where the gesture boundaries are identified using a filler model and a gesture completion model. Experimental results show that the proposed method outperforms state-of-the-art Hidden Conditional Random Fields (HCRF) based methods and baseline gesture spotting techniques.