Automatic Speech Recognition in Primary Progressive Apraxia of Speech.

Katerina A Tetzloff, Daniela Wiepert, Hugo Botha, Joseph R Duffy, Heather M Clark, Jennifer L Whitwell, Keith A Josephs, Rene L Utianski

doi:10.1044/2024_jslhr-24-00049

Abstract

Transcribing disordered speech can be useful when diagnosing motor speech disorders such as primary progressive apraxia of speech (PPAOS), in which patients produce sound additions, deletions, and substitutions, or distortions and/or slow, segmented speech. Because transcribing speech is laborious and requires an experienced listener, using automatic speech recognition (ASR) systems for diagnosis and treatment monitoring is appealing. This study evaluated the efficacy of a readily available ASR system (wav2vec 2.0) in transcribing the speech of patients with PPAOS to determine whether the word error rate (WER) output by the ASR can differentiate between healthy speech and PPAOS and/or among its subtypes, whether WER correlates with AOS severity, and how the ASR's errors compare to those noted in manual transcriptions. Forty-five patients with PPAOS and 22 healthy controls were recorded repeating 13 words, 3 times each; the recordings were transcribed manually and with wav2vec 2.0. The WER and the phonetic and prosodic speech errors were compared between groups, and ASR results were compared against manual transcriptions. Mean overall WER was 0.88 for patients and 0.33 for controls. WER significantly correlated with AOS severity and accurately distinguished between patients and controls but not between AOS subtypes. The phonetic and prosodic errors from the ASR transcriptions were also unable to distinguish between subtypes, whereas errors calculated from human transcriptions were able to do so. There was poor agreement in the number of phonetic and prosodic errors between the ASR and human transcriptions. This study demonstrates that ASR can be useful in differentiating healthy from disordered speech and evaluating PPAOS severity but does not distinguish PPAOS subtypes. ASR transcriptions showed weak agreement with human transcriptions; thus, ASR may be a useful tool for the transcription of speech in PPAOS, but the research questions posed must be carefully considered within the context of its limitations.
https://doi.org/10.23641/asha.26359417.
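
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR output, divided by the number of reference words. The abstract does not state how the study computed it, so the following is only a minimal sketch of the conventional definition. The function name `word_error_rate` and the example words are hypothetical and are not taken from the study's stimuli or code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level Levenshtein distance / reference length.

    Illustrative only; the study's own scoring procedure is not described
    in the abstract and may differ (e.g., in text normalization).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical single-word repetition task (not the study's word list).
print(word_error_rate("catastrophe", "catastrophe"))    # exact match
print(word_error_rate("catastrophe", "cat as trophy"))  # segmented output
print(word_error_rate("catastrophe", ""))               # nothing recognized
```

Because the denominator is the reference length, a single-word target that the ASR splits into several words can yield a WER above 1, which is worth keeping in mind when interpreting per-token scores on single-word repetition tasks.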

More From: Journal of speech, language, and hearing research : JSLHR
