Abstract
This study investigates signals from sustained phonation and text-dependent speech modalities for Parkinson’s disease screening. Phonation corresponds to the vowel /a/ voicing task and speech to the pronunciation of a short sentence in the Lithuanian language. Signals were recorded through two channels simultaneously, namely, acoustic cardioid (AC) and smart phone (SP) microphones. Additional modalities were obtained by splitting the speech recording into voiced and unvoiced parts. Information in each modality is summarized by 18 well-known audio feature sets. Random forest (RF) is used as the machine learning algorithm, both for individual feature sets and for decision-level fusion. Detection performance is measured by the out-of-bag equal error rate (EER) and the cost of log-likelihood-ratio (Cllr). The Essentia audio feature set performed best on the AC speech modality and the YAAFE audio feature set performed best on the SP unvoiced modality, achieving EERs of 20.30% and 25.57%, respectively. Fusion of all feature sets and modalities resulted in an EER of 19.27% for the AC channel and 23.00% for the SP channel. Non-linear projection of an RF-based proximity matrix into 2D space enriched medical decision support through visualization.
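As an illustration of the EER measure mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each recording has a detection score (for example, an out-of-bag RF vote fraction, where higher means more PD-like) and a binary label (1 = PD, 0 = healthy control). The function name `equal_error_rate` and the threshold sweep are illustrative assumptions.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER: the operating point where the miss rate
    equals the false-alarm rate.

    scores: detection scores, higher = more PD-like (assumption).
    labels: 1 for PD recordings, 0 for healthy controls (assumption).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    best_gap, eer = None, None
    # Sweep every observed score as a threshold; a recording is
    # classified as PD when its score is >= the threshold.
    for t in sorted(set(scores)) + [float("inf")]:
        miss = sum(s < t for s in pos) / len(pos)          # false negatives
        false_alarm = sum(s >= t for s in neg) / len(neg)  # false positives
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            # With finite data the two rates may never coincide exactly,
            # so report their average at the closest crossing.
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


if __name__ == "__main__":
    demo_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    demo_labels = [1, 1, 0, 1, 0, 1, 0, 0]
    print(equal_error_rate(demo_scores, demo_labels))
```

An EER of 20.30% therefore means that, at the threshold where both error types are balanced, roughly one in five recordings of each class is misclassified.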
Highlights
Parkinson’s disease (PD) is the second most common neurodegenerative disease after Alzheimer’s [1], and the prevalence of PD is expected to increase due to population ageing.
The detection performance of individual feature sets was evaluated by estimating the recording-based cost of log-likelihood-ratio (Cllr) and equal error rate (EER) measures.
Speech often outperforms phonation, especially in the smart phone (SP) channel, where only 2 feature sets (# 1–2) according to Cllr, or 5 feature sets (# 1–4, 8) according to EER, are exceptions to this tendency.
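For readers unfamiliar with Cllr, the sketch below uses the standard definition of the cost of log-likelihood-ratio: the average, over both classes, of a logarithmic penalty on each recording's log-likelihood ratio (LLR). It assumes the detector scores have already been converted to natural-log LLRs. How the study calibrates RF outputs into LLRs is not stated here, so that step is omitted, and the names `cllr` and `_log2_1p_exp` are illustrative.

```python
import math


def _log2_1p_exp(x):
    """Numerically stable log2(1 + exp(x))."""
    if x > 0:
        return (x + math.log1p(math.exp(-x))) / math.log(2)
    return math.log1p(math.exp(x)) / math.log(2)


def cllr(llrs, labels):
    """Cost of log-likelihood-ratio.

    llrs:   natural-log likelihood ratios, positive = supports PD (assumption).
    labels: 1 for PD recordings, 0 for healthy controls (assumption).
    """
    target = [l for l, y in zip(llrs, labels) if y == 1]
    non_target = [l for l, y in zip(llrs, labels) if y == 0]
    if not target or not non_target:
        raise ValueError("both classes must be present")

    # PD recordings are penalized for low LLRs, controls for high LLRs.
    cost_target = sum(_log2_1p_exp(-l) for l in target) / len(target)
    cost_non_target = sum(_log2_1p_exp(l) for l in non_target) / len(non_target)
    return 0.5 * (cost_target + cost_non_target)


if __name__ == "__main__":
    # An uninformative detector (LLR = 0 everywhere) scores Cllr = 1.
    print(cllr([0.0, 0.0, 0.0, 0.0], [1, 1, 0, 0]))
    # Confident, correct LLRs drive Cllr toward 0.
    print(cllr([10.0, 10.0, -10.0, -10.0], [1, 1, 0, 0]))
```

Unlike EER, which depends only on how well the scores rank the two classes, Cllr also penalizes poorly calibrated scores, so the two measures can disagree on which feature sets are exceptions.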
Summary
Parkinson’s disease (PD) is the second most common neurodegenerative disease after Alzheimer’s [1], and the prevalence of PD is expected to increase due to population ageing. The loss of dopaminergic neurons can reach up to 50% at the time of clinical diagnosis [2], and it then progresses rapidly, reaching completion by 4 years post-diagnosis [3]. Any neuroprotective strategies that may emerge in the near future could therefore come too late to slow the neurodegenerative process effectively, so early objective diagnostic markers are critically needed. PD manifests itself through speech disorders, which can be observed as early as 5 years before diagnosis [4].