Abstract
Recently proposed Parkinson’s disease (PD) telediagnosis systems based on detecting dysphonia achieve very high classification rates in discriminating healthy subjects from PD patients. However, the data used to construct the classification models in these studies contain speech recordings of both early and late PD patients with differing severities of speech impairment, which leads to unrealistic results. In a more realistic scenario, an early telediagnosis system is expected to be used in suspicious cases by healthy subjects or early PD patients with mild speech impairment. In this paper, considering the critical importance of early diagnosis in the treatment of the disease, we evaluate the ability of vocal features to support early telediagnosis of PD using machine learning techniques with a two-step approach. In the first step, using only patient data, we aim to determine the patient group with relatively greater severity of speech impairment, using the Unified Parkinson’s Disease Rating Scale (UPDRS) score as an index of disease progression. For this purpose, we use three supervised and two unsupervised learning techniques. In the second step, we exclude the samples of this group of patients from the dataset, create a new dataset consisting of the samples of PD patients with less severe speech impairment and of healthy subjects, and use three classifiers with various settings to address this binary classification problem. In this classification problem, the highest accuracy of 96.4% and a Matthews correlation coefficient of 0.77 are obtained using support vector machines with a third-degree polynomial kernel, showing that vocal features can be used to build a decision support system for early telediagnosis of PD.
Highlights
In [28], using disease duration as an index of disease progression, the association between disease duration and various Unified Parkinson’s Disease Rating Scale (UPDRS) subscores is examined; the findings reveal that the activities of daily living (ADL) subscore and the motor subscore, each of which includes a speech part, are strongly associated with disease duration.
We present the results in terms of accuracy and the Matthews correlation coefficient (MCC). Because the binary classification problem obtained for a given UPDRS threshold value may yield an imbalanced dataset, in which one class has more samples than the other, we take the MCC into account to determine the maximally predictable UPDRS threshold value.
The features are fed into Support Vector Machines (SVM), Extreme Learning Machines (ELM), and k-nearest neighbors (k-NN) classifiers for various motor UPDRS threshold values.
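The reason for reporting MCC alongside accuracy can be illustrated with a minimal sketch. The code below is not the study's pipeline: the motor UPDRS scores are hypothetical toy values, the helper names (`binarize`, `confusion_counts`, `majority_baseline`) are assumptions introduced for illustration, and a trivial majority-class predictor stands in for the SVM, ELM, and k-NN classifiers. It binarizes the scores at several thresholds and shows that a predictor that ignores the features can still reach high accuracy on an imbalanced split while its MCC stays at 0.

```python
import math

def binarize(scores, threshold):
    """Label a sample 1 if its motor UPDRS score is >= threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is 0."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def majority_baseline(y_true):
    """Stand-in 'classifier' (assumption): always predicts the majority class."""
    majority = 1 if 2 * sum(y_true) >= len(y_true) else 0
    return [majority] * len(y_true)

# Hypothetical motor UPDRS scores, for illustration only.
scores = [5, 8, 9, 11, 12, 14, 15, 18, 22, 35]

for threshold in (10, 15, 30):
    y_true = binarize(scores, threshold)
    y_pred = majority_baseline(y_true)
    counts = confusion_counts(y_true, y_pred)
    print(f"threshold={threshold} positives={sum(y_true)} "
          f"accuracy={accuracy(*counts):.2f} mcc={mcc(*counts):.2f}")
```

In this toy setting, the threshold of 30 leaves a single positive sample, so the baseline scores 0.90 accuracy with an MCC of 0.00. Selecting the threshold by MCC rather than accuracy avoids rewarding such degenerate splits.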
Summary
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Parkinson’s disease (PD) is one of the most frequently seen neurodegenerative disorders affecting the human central, peripheral, and enteric nervous systems [1]. In a recent study that synthesized studies on the prevalence of PD, meta-analysis of the worldwide data showed that PD prevalence increases steadily with age, from 41 per 100,000 in those aged 40 to 49 years to 1,903 per 100,000 in those older than 80 years [2]. The standardized incidences reported in previous studies ranged from 16 to 19 per 100,000 per year [3]. Many studies have reported that PD incidence rises