Abstract

Speech impairment is a disability that affects an individual's ability to communicate with others. People with this impairment use sign language to communicate. Although sign language enables communication among signers, a gap remains between people who sign and those who do not. To overcome this problem, researchers have been developing systems based on deep learning. The main objective of this paper is to implement a vision-based application that translates sign language into voice messages and text, thereby reducing the gap between these two groups. The proposed model extracts spatial and temporal features from video sequences. Spatial features are extracted with MediaPipe Holistic, which provides solutions for detecting face, hand, and pose landmarks. Recurrent Neural Network (RNN) variants, namely LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit), are trained on the temporal features. Using both models on American Sign Language data, 99% accuracy is achieved. The experimental results show that recognition using MediaPipe Holistic followed by a GRU or LSTM attains a recognition rate high enough to meet the needs of a real-time Sign Language Recognition system. This analysis is expected to facilitate the creation of intelligent Sign Language Recognition systems, contribute to knowledge accumulation, and provide direction for future work.
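
The paper's abstract describes a two-stage pipeline: per-frame landmark extraction with MediaPipe Holistic followed by a recurrent network over the frame sequence. The code below is a minimal sketch of that idea, not the authors' implementation; it assumes the standard mediapipe, opencv-python, and tensorflow packages, and the sequence length (30 frames), layer sizes, and number of sign classes are illustrative assumptions.

    # Sketch: MediaPipe Holistic extracts face/hand/pose landmarks per frame
    # (spatial features); an LSTM network classifies the frame sequence
    # (temporal features). A GRU variant would swap tf.keras.layers.GRU
    # in place of the LSTM layers.
    import cv2
    import numpy as np
    import mediapipe as mp
    import tensorflow as tf

    mp_holistic = mp.solutions.holistic

    def extract_keypoints(results):
        """Flatten Holistic landmarks for one frame into a feature vector."""
        pose = np.array([[l.x, l.y, l.z, l.visibility]
                         for l in results.pose_landmarks.landmark]).flatten() \
            if results.pose_landmarks else np.zeros(33 * 4)
        face = np.array([[l.x, l.y, l.z]
                         for l in results.face_landmarks.landmark]).flatten() \
            if results.face_landmarks else np.zeros(468 * 3)
        lh = np.array([[l.x, l.y, l.z]
                       for l in results.left_hand_landmarks.landmark]).flatten() \
            if results.left_hand_landmarks else np.zeros(21 * 3)
        rh = np.array([[l.x, l.y, l.z]
                       for l in results.right_hand_landmarks.landmark]).flatten() \
            if results.right_hand_landmarks else np.zeros(21 * 3)
        return np.concatenate([pose, face, lh, rh])  # 1662 values per frame

    def video_to_sequence(video_path, seq_len=30):
        """Read a video clip and return a (seq_len, 1662) spatial-feature array."""
        cap = cv2.VideoCapture(video_path)
        frames = []
        with mp_holistic.Holistic(min_detection_confidence=0.5,
                                  min_tracking_confidence=0.5) as holistic:
            while len(frames) < seq_len:
                ok, frame = cap.read()
                if not ok:
                    break
                results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                frames.append(extract_keypoints(results))
        cap.release()
        while len(frames) < seq_len:          # zero-pad short clips
            frames.append(np.zeros(33 * 4 + 468 * 3 + 2 * 21 * 3))
        return np.array(frames)

    num_classes = 10  # assumed number of ASL signs in the vocabulary
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(30, 1662)),
        tf.keras.layers.LSTM(64, return_sequences=True, activation='tanh'),
        tf.keras.layers.LSTM(128, activation='tanh'),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(num_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

In use, each training video would be converted with video_to_sequence and the resulting arrays batched for model.fit; at inference time the predicted sign label can be passed to a text-to-speech engine to produce the voice output described in the abstract.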
