Automatic speech recognition (ASR) can potentially help older adults and people with disabilities reduce their dependence on others and increase their participation in society. However, maxillectomy patients with reduced speech intelligibility may encounter problems when using such technologies. This study investigated the accuracy of three commonly used ASR platforms when used by Japanese maxillectomy patients with and without their obturator in place. Speech samples were obtained from 29 maxillectomy patients, recorded both with and without their obturator, and from 17 healthy volunteers. The samples were input into three speaker-independent speech recognition platforms, and the transcribed text was compared with the original text to calculate the syllable error rate (SER). All participants also completed a conventional speech intelligibility test, and their speech was graded using Taguchi's method. A comprehensive articulation assessment of patients without their obturator was also performed. Significant differences in SER were observed between the healthy and maxillectomy groups. Maxillectomy patients with an obturator showed a significant negative correlation between speech intelligibility scores and SER, whereas no significant correlation was observed for those without an obturator. Furthermore, for maxillectomy patients without an obturator, SER differed significantly between syllables grouped by vowel: syllables containing /i/, /u/ and /e/ exhibited higher error rates than those containing /a/ and /o/. Significant differences were also observed when syllables were grouped by consonant place of articulation and by manner of articulation. The three platforms performed well for healthy volunteers and for maxillectomy patients with their obturator, but the SER for maxillectomy patients without their obturator was high enough to render the platforms unusable. System improvement is needed to increase accuracy for maxillectomy patients.
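The abstract does not state the exact SER formula, so the following is only a minimal sketch under a common assumption: that SER is the syllable-level edit distance (substitutions, deletions and insertions) between the original and transcribed text, divided by the number of syllables in the original. The function names and the example syllables are hypothetical and are not taken from the study.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two syllable sequences."""
    # prev[j] holds the distance between the first i-1 reference
    # syllables and the first j hypothesis syllables.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference syllables
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def syllable_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Assumed definition: edit distance / number of reference syllables."""
    if not ref:
        raise ValueError("reference must contain at least one syllable")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substitution (ki -> ti) and one deletion (ke).
reference = ["ka", "ki", "ku", "ke", "ko"]
transcribed = ["ka", "ti", "ku", "ko"]
ser = syllable_error_rate(reference, transcribed)
```

Under this assumed definition the example has two errors over five reference syllables. Because insertions are counted, the value can exceed 1 when the transcription is much longer than the original.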