Linguists have long recognized sign languages as natural languages fully capable of conveying human ideas and emotions. Sign language translation aims either to render written language as sign videos or to extract spoken-language sentences from sign videos. Sign language is the principal means of communication for the deaf and hard-of-hearing community, which worldwide includes an estimated 32 million children and 328 million adults with hearing impairment. However, current systems cannot accurately translate and transmit sign language motions in real time, which prevents effective, spontaneous communication. This research presents a novel technique for real-time recognition of ISL gestures that couples natural language processing with cross-modal integration. The pipeline spans data collection, preprocessing, model selection, and training, and employs a Single Shot MultiBox Detector (SSD) with a MobileNetV2 backbone. In real-time inference, the trained model attains 94% accuracy, demonstrating strong performance and encouraging prospects for improving communication accessibility for people with hearing impairments.
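To make the detection stage concrete, the sketch below shows a minimal real-time inference loop of the kind the abstract describes: an SSD detector with a MobileNetV2 backbone applied to webcam frames. It is illustrative only, not the authors' code; it loads the publicly available COCO-trained SSD MobileNetV2 checkpoint from TensorFlow Hub as a stand-in, whereas the paper's model would be fine-tuned on ISL gesture data, and the 0.5 confidence threshold is an assumed, tunable value.

```python
# Illustrative sketch: real-time SSD MobileNetV2 inference on webcam frames.
# Assumption: uses the public COCO-trained TF Hub checkpoint, not the
# authors' ISL-trained weights, which are not provided in the abstract.
import cv2
import tensorflow as tf
import tensorflow_hub as hub

# Public SSD MobileNetV2 detector from TensorFlow Hub.
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")

cap = cv2.VideoCapture(0)  # default webcam
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # The model expects a uint8 batch of shape [1, H, W, 3] in RGB order.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        batch = tf.expand_dims(tf.convert_to_tensor(rgb, dtype=tf.uint8), 0)
        result = detector(batch)

        boxes = result["detection_boxes"][0].numpy()    # [N, 4] normalized ymin, xmin, ymax, xmax
        scores = result["detection_scores"][0].numpy()  # [N] confidence scores
        h, w = frame.shape[:2]
        for box, score in zip(boxes, scores):
            if score < 0.5:  # assumed confidence threshold (tunable)
                continue
            ymin, xmin, ymax, xmax = box
            cv2.rectangle(frame,
                          (int(xmin * w), int(ymin * h)),
                          (int(xmax * w), int(ymax * h)),
                          (0, 255, 0), 2)
        cv2.imshow("SSD MobileNetV2 detections", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```

Swapping in gesture-specific weights would only change the checkpoint loaded by `hub.load` and the class labels attached to each detection; the capture-detect-draw loop itself is the same for any SSD-style model.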