Speech impairment is a disability that affects a person's ability to communicate with others through speaking and listening. People affected by this disability adopt sign language as an alternative mode of communication. Although sign language is widely used by the speech and hearing impaired, non-signers face difficulties in communicating with them. Recent advances in deep learning and computer vision have led to substantial improvements in motion and hand-gesture recognition. The main objective of the presented work is to develop a vision-based application that translates sign language into speech, enabling communication between verbal and non-verbal communicators. The proposed framework collects video sequences and then extracts spatial and temporal features from them. Spatial features are recognized using Convolutional Neural Networks, while Recurrent Neural Networks are trained on the temporal features. The datasets employed are American Sign Language datasets.
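The abstract does not include an implementation, but a minimal sketch of the CNN-plus-RNN pipeline it describes might look as follows. PyTorch is assumed, and the layer sizes, number of output classes, and frame resolution are illustrative placeholders rather than values taken from the paper.

```python
# Minimal sketch (not the authors' code): a per-frame CNN spatial feature
# extractor followed by an LSTM over the frame sequence, assuming PyTorch.
import torch
import torch.nn as nn


class SignLanguageRecognizer(nn.Module):
    def __init__(self, num_classes=26, cnn_feat_dim=128, lstm_hidden=256):
        super().__init__()
        # Spatial feature extractor applied to each frame independently.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, cnn_feat_dim),
        )
        # Temporal model over the sequence of per-frame features.
        self.rnn = nn.LSTM(cnn_feat_dim, lstm_hidden, batch_first=True)
        self.classifier = nn.Linear(lstm_hidden, num_classes)

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t, c, h, w = clips.shape
        frames = clips.view(b * t, c, h, w)
        feats = self.cnn(frames).view(b, t, -1)   # spatial features per frame
        _, (hidden, _) = self.rnn(feats)          # temporal aggregation
        return self.classifier(hidden[-1])        # class logits per clip


if __name__ == "__main__":
    model = SignLanguageRecognizer(num_classes=26)
    dummy_clips = torch.randn(2, 16, 3, 112, 112)  # 2 clips of 16 RGB frames
    print(model(dummy_clips).shape)                # torch.Size([2, 26])
```

In this sketch the CNN handles the spatial features of each frame and the LSTM (one form of Recurrent Neural Network) aggregates them over time, mirroring the division of labor described in the abstract.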