Abstract

Real-time hand gesture recognition in complex environments faces many challenges, such as poor real-time performance and weak robustness to environmental changes. This paper takes hand gesture control of an unmanned vehicle as the application background and focuses on deep-learning-based gesture detection and recognition in video streams under complex environments. We detect the hand in a complex environment by training an ssd_mobilenet model and initialize tracking with a Kalman filter. We then detect hand keypoints following the architecture of Convolutional Pose Machines (CPMs) to obtain belief maps for all keypoints, which serve as the training set of a Convolutional Neural Network (CNN) classifier. Finally, based on the classification results, we propose a multi-frame recursion method to minimize the influence of redundant and erroneous frames. Eight kinds of gestures for controlling the vehicle are identified. The experimental results show that our method achieves real-time hand gesture recognition in video streams: the recognition accuracy reaches 96.7%, and the average recognition speed reaches 12 fps, which basically meets real-time requirements and has been successfully applied to mobile terminals such as the TX2 for engineering practice.
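To illustrate the idea of filtering per-frame predictions over multiple frames, the following is a minimal sketch of one possible sliding-window voting scheme, not the paper's actual algorithm. The class name `MultiFrameFilter` and the parameters `window_size` and `min_ratio` are illustrative assumptions.

```python
from collections import Counter, deque


class MultiFrameFilter:
    """Sliding-window vote over per-frame gesture labels.

    A gesture command is emitted only when one label dominates the recent
    window, which suppresses isolated misclassified frames, and only when it
    differs from the last emitted command, which suppresses redundant frames.
    """

    def __init__(self, window_size=9, min_ratio=0.7):
        # window_size and min_ratio are illustrative values,
        # not parameters reported in the paper.
        self.window = deque(maxlen=window_size)
        self.min_ratio = min_ratio
        self.last_emitted = None

    def update(self, frame_label):
        """Add one per-frame prediction; return a gesture to act on, or None."""
        self.window.append(frame_label)
        if len(self.window) < self.window.maxlen:
            return None  # not enough history yet
        label, count = Counter(self.window).most_common(1)[0]
        if count / len(self.window) < self.min_ratio:
            return None  # no stable majority in the window
        if label == self.last_emitted:
            return None  # redundant frames: command already issued
        self.last_emitted = label
        return label  # newly stabilized gesture command
```

In use, each per-frame CNN prediction would be passed to `update`, and a control command would be sent to the vehicle only when a non-`None` label is returned.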
