Abstract

Artificial intelligence (AI) technologies have advanced significantly in the past decade. However, the unpredictability and variability of AI behavior with noisy signals remain underexplored and pose a challenge when generalizing AI behavior to real-life environments, especially for people with a speech disorder, who already experience reduced speech intelligibility. In the context of developing assistive technology for people with Parkinson's disease using automatic speech recognition (ASR), this pilot study reports on the performance of Google Cloud speech-to-text technology with dysarthric and healthy speech in the presence of multi-talker babble noise at different intensity levels. Despite sensitivities and shortcomings, it is possible to control the performance of these systems with current tools in order to measure speech intelligibility in real-life conditions.

Highlights

  • Parkinson’s disease (PD) is the second most common neurodegenerative disorder, following Alzheimer’s disease (Dorsey et al., 2007), with a prevalence of more than six million people worldwide (Dorsey et al., 2018)

  • Understanding the sensitivity of deep neural networks (DNN) to various application-specific types of noise and establishing protocols to ameliorate response variability can help generalize artificial intelligence (AI) to real-life applications. The goal of this pilot study was to measure speech intelligibility in individuals with Parkinson’s disease using automatic speech recognition (ASR) in noise. To this end, we report the sensitivity of the Google Cloud speech-to-text API, a prominent provider of ASR, to a specific type of background noise, multi-talker babble, which is commonly implemented in the study of dysarthria (Moya-Galé et al., 2018; Chiu and Neel, 2020)

  • Our goal was to determine the feasibility of implementing this service in the development of assistive technologies for people with PD, whose voice and speech difficulties may significantly decrease their intelligibility in noisy settings


Summary

Introduction

Parkinson’s disease (PD) is the second most common neurodegenerative disorder, following Alzheimer’s disease (Dorsey et al., 2007), with a prevalence of more than six million people worldwide (Dorsey et al., 2018). One of the hallmarks of PD is the presence of dysarthria, a motor speech disorder characterized by a significant reduction in vocal loudness (i.e., hypophonia), monopitch, hoarse and breathy vocal quality, misarticulations of consonants and vowels, short rushes of speech, and variable rate (Duffy, 2020). These deviations from healthy speech have a significant impact on speech intelligibility, which refers to how an acoustic signal is decoded by a listener (Kent et al., 1989). It is well known that ∼90% of individuals with PD are likely to develop voice and speech problems during the course of the disease (Logemann et al., 1978) and that more than half of these speakers experience problems with intelligibility (Miller et al., 2007).

