Abstract

In the sign language alphabet recognition problem, the scope of most studies is limited to static hand gestures, which do not cover all gestures of sign language. This paper aims to find an approach for recognizing both the static and dynamic gestures of the American Sign Language (ASL) alphabet, and applies generative adversarial networks (GANs) to generate synthetic images that increase the dataset size. The proposed method combines convolutional neural networks (CNNs) with long short-term memory (LSTM) networks to extract features and classify images of the ASL alphabet along various dimensions. When tested across various batch sizes and epochs, the proposed method achieves over 97% accuracy on two consecutive images, while on 1D vector images accuracy reaches 90% at large batch sizes; the method is therefore better suited to two consecutive images than to 1D vector images. For dynamic gestures, the proposed CNN-LSTM on two consecutive images performs worse than a simple CNN in the early epochs, but its accuracy converges quickly and matches that of the simple CNN within a few epochs. Our proposed approach offers good results and outperforms a simple CNN for dynamic ASL alphabet gestures, especially on 1D vector images.
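The CNN-LSTM combination described above can be sketched as a shared CNN encoder applied to each of the two consecutive frames, with an LSTM modeling the temporal relation between them. The sketch below uses PyTorch; the layer sizes, 64x64 grayscale input, and 26-class output (one per ASL letter) are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Sketch: per-frame CNN features fed to an LSTM, then a classifier.

    Assumed input: (batch, frames, 1, 64, 64), e.g. two consecutive
    grayscale frames of a hand gesture. Sizes are hypothetical.
    """
    def __init__(self, num_classes=26, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
        )
        feat_dim = 32 * 16 * 16  # 64x64 input after two 2x2 poolings
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        b, t = x.shape[:2]
        # Encode every frame with the same CNN, then restore the time axis.
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)        # temporal modeling across frames
        return self.head(out[:, -1])     # classify from the last time step

model = CNNLSTM()
logits = model(torch.randn(4, 2, 1, 64, 64))  # batch of 4, two frames each
print(tuple(logits.shape))  # (4, 26)
```

For the 1D-vector variant mentioned in the abstract, each frame would be flattened to a vector and the CNN encoder replaced (or preceded) by a 1D pipeline; the LSTM and classifier stages stay the same.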
