Sign language is the primary means of communication for many people with hearing and speech impairments, yet it is often difficult for those unfamiliar with it to understand. Its use, however, is not limited to the deaf community: it has been officially recognized in numerous countries and is increasingly offered as a second-language option in educational institutions. Sign language has also proven useful in professional sectors such as interpreting, education, and healthcare, where it facilitates communication between people with and without hearing impairments. Advanced technologies, including computer vision and machine learning algorithms, are now used to interpret and translate sign language into spoken or written form. These technologies aim to promote inclusivity and provide equal opportunities for people with hearing impairments in domains such as education, employment, and social interaction. In this paper, we implement the DeafTech Vision (DTV-CNN) architecture, a convolutional neural network for recognizing American Sign Language (ASL) gestures using deep learning techniques. Our main objective is to develop a robust ASL sign classification model that enhances human-computer interaction and assists individuals with hearing impairments. In extensive evaluation, our model consistently outperformed baseline methods in terms of precision, achieving an accuracy of 99.87% on the ASL alphabet test dataset and 99.94% on the ASL digit dataset, significantly exceeding previous research, which reported an accuracy of 90.00%. We also illustrate the model's learning trends and convergence points using loss and error graphs. These results highlight the effectiveness of DTV-CNN in distinguishing complex ASL gestures.
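To make the classification setup concrete, the sketch below shows a minimal CNN image classifier of the kind the abstract describes. The Keras framework, the 28x28 grayscale input resolution, and all layer sizes are assumptions chosen for illustration; they do not reflect the exact DTV-CNN architecture reported in the paper.

```python
# Minimal, hypothetical sketch of a CNN classifier for ASL gestures.
# Assumes 28x28 grayscale hand images and 26 alphabet classes; these
# choices are illustrative only, not the authors' DTV-CNN configuration.
import tensorflow as tf
from tensorflow.keras import layers, models


def build_asl_cnn(input_shape=(28, 28, 1), num_classes=26):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),  # regularization to reduce overfitting
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


if __name__ == "__main__":
    model = build_asl_cnn()
    model.summary()
    # Training on a labeled ASL dataset would then look like:
    # model.fit(train_images, train_labels, epochs=10, validation_split=0.1)
```

A separate model with `num_classes=10` could be built the same way for the digit dataset; the loss and accuracy curves recorded during `fit` correspond to the learning-trend plots the abstract mentions.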