Abstract
Hand gestures are one of the mediums that many people use to communicate with each other. Gesture recognition applications have become increasingly popular in recent years, especially in computer vision. Typically, gestures can easily be recognized from a single image frame (e.g., alphabets from sign language); however, recognizing complex gestures with subtle differences in movement requires more work and larger datasets. In this work, we introduce a simple gesture recognition system that translates 5 different hand gestures, namely “doing other things”, “swiping down”, “swiping left”, “zooming out with two fingers” and “drumming fingers”. We used data obtained from the Jester dataset. The inputs were processed in RGB format during the pre-processing phase, and a spatiotemporal filter was used for feature extraction, which is also the main building block of this system. Next, we trained on the extracted features using a 3D Convolutional Neural Network (3D-CNN). Further, we used real-time video to test the developed recognition system with 5 different actors. Findings show that the developed model can translate hand gestures with an accuracy of 85.70% and a loss of 0.4%.
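To make the pipeline concrete, the sketch below shows a minimal 3D-CNN gesture classifier in PyTorch. It is an illustrative assumption, not the paper's architecture: the layer sizes, clip length (16 frames), frame resolution (64x64), and framework choice are all hypothetical, since the abstract only states that spatiotemporal filters feed a 3D-CNN over RGB clips with 5 output classes.

```python
# Minimal sketch of a 3D-CNN gesture classifier (assumptions: PyTorch,
# 16-frame RGB clips at 64x64; the abstract does not give the real
# architecture or hyperparameters).
import torch
import torch.nn as nn

class Gesture3DCNN(nn.Module):
    """Maps an RGB video clip to one of 5 gesture classes."""
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            # Spatiotemporal filters: kernels convolve jointly over
            # (time, height, width), capturing motion as well as appearance.
            nn.Conv3d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),  # pool space only at first
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2),          # pool time and space
            nn.AdaptiveAvgPool3d(1),              # global average pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels=3, frames, height, width)
        z = self.features(x).flatten(1)
        return self.classifier(z)

model = Gesture3DCNN()
clip = torch.randn(1, 3, 16, 64, 64)  # one dummy 16-frame RGB clip
logits = model(clip)                  # shape: (1, 5), one score per gesture
```

In practice, such a model would be trained with a cross-entropy loss on labeled Jester clips, and at inference time a sliding window of recent webcam frames would be fed through the network for real-time prediction.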