Abstract
Hand gesture recognition has attracted considerable attention in computer vision because of its many applications. Most existing work uses RGB data for hand gesture recognition. However, the accuracy of RGB-based methods depends on lighting, viewing angle, and other factors, including skin color. In this work, we therefore propose a methodology for video-based hand gesture recognition using thermal data. We first created a dataset of short video sequences captured with a thermal camera. We then propose a lightweight convolutional neural network (CNN) model for hand gesture recognition. The performance of the proposed CNN model is evaluated on three versions of the dataset, with 15, 10, and 5 frames per sequence. Results show that the proposed model achieves accuracies of $$97\% \pm (0.05)$$, $$96\% \pm (0.05)$$, and $$87\% \pm (0.1)$$ on the datasets with 15, 10, and 5 frames per sequence, respectively.