Sign language is a vital form of communication for the Deaf and Hard of Hearing community, yet a significant barrier remains in real-time interaction with non-signers. This project proposes the development of a Sign Language to Text Converter, a system designed to translate sign language gestures into written text. The goal is an accessible, real-time tool that bridges the communication gap between Deaf individuals and those unfamiliar with sign language, thereby enhancing interaction in diverse spheres such as education, healthcare, and social settings. The system applies computer vision and machine learning techniques to recognize and interpret hand gestures: it processes real-time video input using OpenCV for image processing and MediaPipe for hand and pose detection, extracting key features such as hand positions, movements, and orientations. These features are then fed into a Convolutional Neural Network (CNN) trained to classify individual sign language gestures. The recognized gestures are rendered as text, which can be displayed on screen or passed to a text-to-speech system to provide auditory output.

Key Words: MediaPipe, Convolutional Neural Network (CNN), Text-to-Speech (TTS), American Sign Language (ASL)
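As a minimal sketch of the front end of such a pipeline, the snippet below shows how OpenCV frame capture could be combined with the classic MediaPipe Hands solution to extract a per-frame hand-landmark feature vector. The `classify_gesture` call is a hypothetical placeholder for the trained CNN described in the abstract, which is not specified here; assumptions include a single tracked hand and a 63-dimensional (21 landmarks x, y, z) feature layout.

```python
import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands


def extract_landmarks(frame, hands):
    """Return a flat (63,) array of x, y, z hand landmarks, or None if no hand is found."""
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB input
    results = hands.process(rgb)
    if not results.multi_hand_landmarks:
        return None
    lm = results.multi_hand_landmarks[0].landmark  # first detected hand
    return np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32).flatten()


def main():
    cap = cv2.VideoCapture(0)  # default webcam
    with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            features = extract_landmarks(frame, hands)
            if features is not None:
                # Hypothetical CNN classifier mapping the 63-dim feature
                # vector to an ASL gesture label (not implemented here):
                # label = classify_gesture(features)
                pass
            cv2.imshow("Sign Language to Text", frame)
            if cv2.waitKey(1) & 0xFF == 27:  # Esc key exits the loop
                break
    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```

In a full system, the predicted label would be appended to a text buffer for on-screen display or forwarded to a TTS engine for auditory output.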