Speech Processing Tasks Research Articles

Cleft lip and palate is one of the most common oral and maxillofacial deformities and is associated with a variety of functional disorders. Cleft palate speech disorder (CPSD) is the most frequent of these and manifests as a series of characteristic speech features, called cleft speech characteristics. Some scholars believe that children with CPSD and poor speech outcomes may also have weaknesses in speech input processing, but evidence is still lacking.

The study had two aims: (1) to explore whether children with CPSD and speech output disorders also have deficits in speech input processing; (2) to explore the correlation between speech input and output processing abilities.

Children in the experimental group were enrolled from Beijing Stomatological Hospital, Capital Medical University, and healthy volunteers were recruited as controls. Three tasks containing real and pseudo words were then performed sequentially. Reaction time, accuracy and other indicators in the three tasks were collected and analysed.

The indicators in the experimental group were significantly lower than those in the control group. There was a strong correlation between speech input and output processing tasks. Both groups performed worse on pseudo words than on real words in all three tasks.

Compared with normal controls, children with CPSD have deficits in both speech input and output processing, and there is a strong correlation between speech input and output processing abilities. In addition, the pseudo word tasks were more challenging than the real word tasks for both groups.

What is already known on the subject: Children with cleft lip and palate often have speech sound disorders known as cleft palate speech disorder (CPSD). CPSD is characterised by consonant errors called cleft speech characteristics, which can persist even after surgery. Some studies suggest that poor speech outcomes in children with CPSD may be associated with deficits in processing speech input. However, this has not been validated in mainland China.

What this paper adds to existing knowledge: The results of our study indicate that children with CPSD perform worse than healthy controls in three tasks assessing speech input and output abilities, suggesting deficits in both speech input and output processing. Furthermore, a significant correlation was observed between speech input and output processing abilities. Additionally, both groups had greater difficulty processing pseudo words than real words.

What are the potential or actual clinical implications of this work? The pseudo word tasks designed and implemented in our study can be employed in future research and assessment of speech input and output abilities in Mandarin-speaking Chinese children with CPSD. Additionally, our findings highlight the importance, for speech and language therapists evaluating and developing treatment options for children with CPSD, of considering both speech output processing abilities and potential deficits in speech input processing, as these abilities are also important for literacy development.

Aim/Objective: Within the dynamic healthcare technology landscape, this research aims to explore patient inquiries within outpatient clinics, elucidating the interplay between technology and healthcare intricacies. Building upon the initial intelligent guidance robot implementation shortcomings, this investigation seeks to enhance informatic robots with voice recognition technology. The objective is to analyze users' vocal patterns, discern age-associated vocal attributes, and facilitate age differentiation through subtle vocal nuances to enhance the efficacy of human-robot communication within outpatient clinical settings.

Methods: This investigation employs a multi-faceted approach. It leverages voice recognition technology to analyze users' vocal patterns. A diverse dataset of voice samples from various age groups was collected. Acoustic features encompassing pitch, formant frequencies, spectral characteristics, and vocal tract length are extracted from the audio samples. The Mel Filterbank and Mel-Frequency Cepstral Coefficients (MFCCs) are employed for speech and audio processing tasks alongside machine learning algorithms to assess and match vocal patterns to age-related traits.

Results: The research reveals compelling outcomes. The incorporation of voice recognition technology contributes to a significant improvement in human-robot communication within outpatient clinical settings. Through accurate analysis of vocal patterns and age-related traits, informatic robots can differentiate age through nuanced verbal cues. This augmentation leads to enhanced contextual understanding and tailored responses, significantly advancing the efficiency of patient interactions with the robots.

Conclusion: Integrating voice recognition technology into informatic robots presents a noteworthy advancement in outpatient clinic settings. By enabling age differentiation through vocal nuances, this augmentation enhances the precision and relevance of responses. The study contributes to the ongoing discourse on the dynamic evolution of healthcare technology, underscoring the complex synergy between technological progression and the intricate realities within healthcare infrastructure. As healthcare continues to metamorphose, the seamless integration of voice recognition technology marks a pivotal stride in optimizing human-robot communication and elevating patient care within outpatient settings.
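
For readers unfamiliar with the Mel filterbank and MFCC features named in the Methods of the abstract above, the following is a minimal illustrative sketch in Python. It is not the authors' implementation: the abstract gives no filter count, FFT size, or mel formula, so the function names, the `2595 * log10(1 + f / 700)` mel convention, and all parameter values below are assumptions chosen only for illustration.

```python
import math


def hz_to_mel(hz):
    # One common mel-scale convention (assumed here, not taken from the abstract).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build n_filters triangular filters over n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points, evenly spaced on the mel scale.
    mel_points = [low + (high - low) * i / (n_filters + 1)
                  for i in range(n_filters + 2)]
    # Edge positions expressed in (fractional) FFT-bin units.
    bin_points = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bin_points[j - 1], bin_points[j], bin_points[j + 1]
        row = []
        for k in range(n_bins):
            if left < k <= centre:
                row.append((k - left) / (centre - left))    # rising slope
            elif centre < k < right:
                row.append((right - k) / (right - centre))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_from_power_spectrum(power, bank, n_coeffs):
    """Log mel-band energies followed by an unnormalised DCT-II."""
    energies = [
        math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
        for row in bank
    ]
    n = len(energies)
    return [
        sum(e * math.cos(math.pi * c * (i + 0.5) / n)
            for i, e in enumerate(energies))
        for c in range(n_coeffs)
    ]
```

As a usage example under the same assumptions, `mel_filterbank(10, 512, 16000)` yields 10 filters over 257 bins, and `mfcc_from_power_spectrum(power, bank, 5)` turns one frame's power spectrum into 5 coefficients. A classifier such as the age-differentiation model the abstract describes would typically consume such coefficients frame by frame.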

Articles published on Speech Processing Tasks

Comparison of wav2vec 2.0 models on three speech processing tasks

Impaired speech input and output processing abilities in children with cleft palate speech disorder.

Effective Monoaural Speech Separation through Convolutional Top-Down Multi-View Network

Gammatonegram representation for end-to-end dysarthric speech processing tasks: speech recognition, speaker identification, and intelligibility assessment

A Bilingual Basque–Spanish Dataset of Parliamentary Sessions for the Development and Evaluation of Speech Technology

Transfer learning methods for low-resource speech accent recognition: A case study on Vietnamese language

Voice separation and recognition using machine learning and deep learning a review paper

Optimizing Voice Recognition Informatic Robots for Effective Communication in Outpatient Settings.

A Novel Bi-Dual Inference Approach for Detecting Six-Element Emotions

The NEF-SPA Approach as a Framework for Developing a Neurobiologically Inspired Spiking Neural Network Model for Speech Production.

Stable eye versus mouth preference in a live speech-processing task

Random Cycle Loss and Its Application to Voice Conversion

Does Assessor Masking Affect Kindergartners' Performance on Oral Language Measures? A COVID-19 Era Experiment With Children From Diverse Home Language Backgrounds.

A review of deep learning techniques for speech processing

Multi-Scale Feature Learning for Language Identification of Overlapped Speech

Speaker Verification Based on Single Channel Speech Separation

Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment

Global–Local Self-Attention Based Transformer for Speaker Verification

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

Speech processing performance of Hungarian-speaking twins and singletons
