Abstract. Sign language recognition is an important technology that enables hearing people to communicate with deaf people, fostering inclusivity and accessibility. The advent of deep learning has transformed the field by enabling the automatic extraction and learning of hierarchical features from raw data, leading to significant improvements in recognition accuracy. This paper presents a comprehensive comparative analysis of Convolutional Neural Network (CNN) architectures for recognizing American Sign Language (ASL) signs. Using a sign language dataset containing 24 classes of ASL letters represented as 28x28 grayscale images, the author evaluated the performance of a Basic CNN, a modified Residual Network (ResNet)-50, and a LeNet-5 model. The study emphasizes the impact of architectural choices on recognition accuracy and computational efficiency. Results indicate that while ResNet-50 achieves superior accuracy, its performance fluctuates significantly during initial training, whereas the Basic CNN and LeNet-5 models offer greater stability at slightly lower accuracy. This work concludes that, despite these initial challenges, deep learning models, particularly ResNet-50, show promise for ASL recognition, and it highlights the need for diverse and enriched datasets to improve model reliability in real-world scenarios.
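To make the setup concrete, the sketch below shows what a Basic CNN for this task might look like in Keras: a 24-class classifier over 28x28 single-channel inputs. The specific layer sizes, optimizer, and `build_basic_cnn` helper are illustrative assumptions, not the paper's exact architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_basic_cnn(num_classes: int = 24) -> tf.keras.Model:
    # Hypothetical "Basic CNN" for ASL letter classification;
    # layer widths and depths are assumptions, not from the paper.
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),           # 28x28 grayscale image
        layers.Conv2D(32, 3, activation="relu"),   # low-level feature maps
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),   # higher-level features
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),  # 24 ASL letters
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```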