Abstract

Human Centered Computing is an emerging research field that aims to understand human behavior. Dynamic hand gesture recognition is one of the most recent, challenging, and appealing applications in this field. In this paper, we propose a vision-based system to recognize dynamic hand gestures of Indian Sign Language (ISL). The system is built on a unified architecture that combines a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network. To address the shortage of large labeled hand gesture datasets, we created two different CNNs by retraining the well-known image classification networks GoogLeNet and VGG16 via transfer learning. Frames of gesture videos are transformed into feature vectors using these CNNs. Because the videos are ordered sequences of image frames, an LSTM model is joined to the fully connected layer of the CNN. We evaluated the system on three different datasets of color videos with 11, 64, and 8 classes. Experiments show that the proposed CNN-LSTM architecture using GoogLeNet is fast and efficient, achieving high recognition rates of 93.18%, 97.50%, and 96.65% on the three datasets, respectively.
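
To make the described pipeline concrete, below is a minimal PyTorch sketch (not the authors' implementation, which the paper does not list) of such a CNN-LSTM classifier: a pretrained GoogLeNet backbone, frozen for transfer learning, maps each frame to a 1024-dimensional feature vector, and an LSTM followed by a fully connected layer classifies the frame sequence. The class count, hidden size, and clip length are illustrative placeholders, and torchvision 0.13 or newer is assumed for the pretrained-weights API.

import torch
import torch.nn as nn
from torchvision.models import googlenet, GoogLeNet_Weights

class CNNLSTM(nn.Module):
    """Sketch of a CNN-LSTM gesture classifier: per-frame CNN features -> LSTM -> class scores."""
    def __init__(self, num_classes=11, hidden_size=256):
        super().__init__()
        backbone = googlenet(weights=GoogLeNet_Weights.IMAGENET1K_V1)
        backbone.fc = nn.Identity()      # expose the 1024-d pooled features
        for p in backbone.parameters():  # transfer learning: freeze the CNN
            p.requires_grad = False
        backbone.eval()                  # also freeze batch-norm statistics
        self.cnn = backbone
        self.lstm = nn.LSTM(input_size=1024, hidden_size=hidden_size,
                            batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, clips):            # clips: (batch, frames, 3, 224, 224)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1))   # (b*t, 1024) frame features
        feats = feats.view(b, t, -1)            # regroup frames into sequences
        _, (h_n, _) = self.lstm(feats)          # final hidden state summarizes the clip
        return self.classifier(h_n[-1])         # (batch, num_classes)

model = CNNLSTM(num_classes=11)                  # e.g. the 11-class dataset
logits = model(torch.randn(2, 16, 3, 224, 224))  # two hypothetical 16-frame clips
print(logits.shape)                              # torch.Size([2, 11])

Freezing the backbone means only the LSTM and the final linear layer are trained, which is one common way to cope with small gesture datasets; fine-tuning the upper CNN layers instead is an equally valid variant.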
