Sign language, often referred to as silent conversation, serves as a visual, gesture-based primary communication medium for hearing-impaired individuals. People who are unable to speak or hear communicate among themselves and with the general population through static and dynamic sign languages. To ensure that people who are deaf or unable to talk do not feel excluded from this fast-paced world, it is all the more crucial that sign language be incorporated into our culture and that a cost-effective and straightforward method of detecting it be developed. With this goal in mind, an automatic Bangla sign language (BSL) detection system has been developed in this work using deep learning approaches and a Jetson Nano edge device. The deep learning models were trained on Okkhornama, an open-source database, and a custom author-curated dataset of 49 categories and 3,760 images was used to validate these techniques. These images were preprocessed by auto-orienting them and resizing them to 416 × 416 pixels in the Roboflow framework. Next, we implemented the Detectron2, TensorFlow-based EfficientDet-D0, and PyTorch-based YOLOv7 models. The Jetson Nano, a portable and sufficiently powerful NVIDIA edge computing board, is also utilized in this project to deploy an object detection model and run inference on test images in real time. The Detectron2 model performed best in detection accuracy, with a mAP@.5 of 94.915 and an AP of 54.814. The YOLOv7 model achieved per-class mAP@.5 accuracies between 85 and 97 percent and mAP@.5:.95 between 41 and 53 percent. Finally, the YOLOv7 Tiny model was deployed on the Jetson Nano edge device for real-time Bangla sign language detection because of its short total training time and high frame rate. The proposed system is expected to provide a simple, affordable, and effective solution for Bangladeshi people with hearing and speech impairments.
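As an illustrative sketch of the deployment step described above (not the authors' published code), the fine-tuned YOLOv7 Tiny weights could be loaded on the Jetson Nano through PyTorch and run against a live camera feed. The weights filename bsl-yolov7-tiny.pt is hypothetical, and the torch.hub entry point assumes the hubconf.py distributed with the WongKinYiu/yolov7 repository:

```python
import cv2
import torch

# Hypothetical fine-tuned weights file; the 'custom' entry point assumes
# the hubconf.py shipped with the WongKinYiu/yolov7 repository.
model = torch.hub.load('WongKinYiu/yolov7', 'custom', 'bsl-yolov7-tiny.pt')
model.conf = 0.5  # minimum confidence for a reported detection

cap = cv2.VideoCapture(0)  # USB/CSI camera attached to the Jetson Nano
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Match the 416 x 416 input size used during Roboflow preprocessing.
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB), size=416)
    results.print()  # log the detected BSL sign classes and confidences
cap.release()
```

A lighter-weight model such as YOLOv7 Tiny is a common choice for this kind of loop, since per-frame inference cost directly bounds the achievable frame rate on an edge device.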