Sign language recognition (SLR) helps hearing-impaired people communicate better with hearing individuals. The complementary information carried by multiple modalities can be exploited to improve SLR; however, existing multimodal fusion methods do not model the interrelationships between modalities in depth. This paper proposes SeeSign, a multimodal fusion framework for SLR based on statistical attention and contrastive attention. The two attention mechanisms are designed to investigate the intra-modal and inter-modal correlations of surface electromyography (sEMG) and inertial measurement unit (IMU) signals and to fuse the two modalities. Statistical attention uses the Laplace operator and a lower-quantile threshold to select and enhance active features within each modal feature clip. Contrastive attention computes the information gain of the active features in a pair of enhanced feature clips located at the same position in the two modalities, and fuses the enhanced clips at each position according to that gain. The fused multimodal features are fed into a Transformer-based network trained with connectionist temporal classification and cross-entropy losses for SLR. Experimental results show that SeeSign achieves an accuracy of 93.17% on isolated words, and word error rates of 18.34% and 22.08% on one-handed and two-handed sign language datasets, respectively. Moreover, it outperforms state-of-the-art methods in terms of accuracy and robustness.
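To make the described fusion pipeline concrete, the sketch below illustrates one plausible reading of the two attention steps. It is not the authors' implementation: the Laplace operator is approximated by a 1-D second-difference filter along time, the enhancement factor, the variance-based proxy for "information gain", and the function names (`statistical_attention`, `contrastive_attention`) are all assumptions introduced here for illustration.

```python
import numpy as np

def statistical_attention(clip, q=0.25):
    """Enhance 'active' features within one modality's feature clip.

    Assumption-based sketch: the Laplace operator is approximated by a
    second-difference filter over time, and features whose response exceeds
    the lower quantile q are treated as active and scaled up.
    """
    # clip: (T, D) feature clip for one modality (e.g., sEMG or IMU)
    lap = np.abs(np.diff(clip, n=2, axis=0, prepend=clip[:1], append=clip[-1:]))
    threshold = np.quantile(lap, q)            # lower-quantile threshold
    active = lap > threshold                   # mask of active features
    return np.where(active, clip * 2.0, clip)  # enhance active features only

def contrastive_attention(emg_clip, imu_clip, eps=1e-8):
    """Fuse a pair of enhanced clips located at the same temporal position.

    Assumption-based sketch: 'information gain' is approximated by each
    modality's share of feature-wise variance, used as the fusion weight.
    """
    gain_emg = emg_clip.var(axis=0) + eps
    gain_imu = imu_clip.var(axis=0) + eps
    w = gain_emg / (gain_emg + gain_imu)       # per-feature gain ratio
    return w * emg_clip + (1.0 - w) * imu_clip # position-wise weighted fusion

# Toy usage: 16-frame clips with 8-dimensional features per modality.
rng = np.random.default_rng(0)
emg = statistical_attention(rng.standard_normal((16, 8)))
imu = statistical_attention(rng.standard_normal((16, 8)))
fused = contrastive_attention(emg, imu)        # (16, 8) fused features
```

In this reading, the fused clips would then be concatenated along the time axis and passed to the Transformer-based recognition network trained with CTC and cross-entropy losses.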