Abstract

Human Centered Computing is an emerging research field that aims to understand human behavior. Dynamic hand gesture recognition is one of the most recent, challenging, and appealing applications in this field. In this paper, we propose a vision-based system to recognize dynamic hand gestures of Indian Sign Language (ISL). The system is built on a unified architecture that combines a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network. To address the shortage of large labeled hand gesture datasets, we created two different CNNs by retraining the well-known image classification networks GoogLeNet and VGG16 via transfer learning. Frames of gesture videos are transformed into feature vectors using these CNNs. Because the videos are ordered sequences of image frames, an LSTM model is joined to the fully connected layer of the CNN. We evaluated the system on three different datasets of color videos with 11, 64, and 8 classes. Experiments show that the proposed CNN-LSTM architecture using GoogLeNet is fast and efficient, achieving high recognition rates of 93.18%, 97.50%, and 96.65% on the three datasets, respectively.
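
To make the described pipeline concrete, below is a minimal PyTorch sketch (not the authors' implementation, which the paper does not list) of such a CNN-LSTM classifier: a pretrained GoogLeNet backbone, frozen for transfer learning, maps each frame to a 1024-dimensional feature vector, and an LSTM followed by a fully connected layer classifies the frame sequence. The class count, hidden size, and clip length are illustrative placeholders, and torchvision 0.13 or newer is assumed for the pretrained-weights API.

import torch
import torch.nn as nn
from torchvision.models import googlenet, GoogLeNet_Weights

class CNNLSTM(nn.Module):
    """Sketch of a CNN-LSTM gesture classifier: per-frame CNN features -> LSTM -> class scores."""
    def __init__(self, num_classes=11, hidden_size=256):
        super().__init__()
        backbone = googlenet(weights=GoogLeNet_Weights.IMAGENET1K_V1)
        backbone.fc = nn.Identity()      # expose the 1024-d pooled features
        for p in backbone.parameters():  # transfer learning: freeze the CNN
            p.requires_grad = False
        backbone.eval()                  # also freeze batch-norm statistics
        self.cnn = backbone
        self.lstm = nn.LSTM(input_size=1024, hidden_size=hidden_size,
                            batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, clips):            # clips: (batch, frames, 3, 224, 224)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1))   # (b*t, 1024) frame features
        feats = feats.view(b, t, -1)            # regroup frames into sequences
        _, (h_n, _) = self.lstm(feats)          # final hidden state summarizes the clip
        return self.classifier(h_n[-1])         # (batch, num_classes)

model = CNNLSTM(num_classes=11)                  # e.g. the 11-class dataset
logits = model(torch.randn(2, 16, 3, 224, 224))  # two hypothetical 16-frame clips
print(logits.shape)                              # torch.Size([2, 11])

Freezing the backbone means only the LSTM and the final linear layer are trained, which is one common way to cope with small gesture datasets; fine-tuning the upper CNN layers instead is an equally valid variant.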
