Abstract

This study examines whether Parkinson’s and age-related voice impairment can be remotely monitored and screened in the general public using self-recorded data from readily available or emerging technologies such as smartphones and IoT devices. While most studies use voice recorded professionally in a controlled environment, this study uses self-recorded sustained vowel /a/ recordings made on an iPhone. The healthy control (HC) and people with Parkinson’s (PWP) groups each have 57 age-matched, mixed-gender subjects. The control subjects may have age-related voice impairment. In the absence of severity labels, features extracted from the recordings were grouped by voice similarity using unsupervised learning with several clustering methods. The optimal number of clusters ([Formula: see text]) was estimated using direct and statistical methods. The estimated [Formula: see text] does not agree with the scales defined by the Unified Parkinson’s Disease Rating Scale-Speech (UPDRS-3.1). Using [Formula: see text], five hierarchical clustering methods and one partition-based method were applied for comparison and cross-checking. The hierarchical methods are Hierarchical Cluster (HCluster), Hierarchical K-Means (HKMeans), Agglomerative Nesting (AGNES), Divisive Analysis (DIANA), and the neural network-based Self-Organized Tree Algorithm (SOTA). The partition-based method is Clustering Large Applications (CLARA). Three internal validation indices (connectivity, Dunn index, and silhouette width) were used to measure the compactness of the clusters and their separation. Ranked from best to worst by these indices, the methods are AGNES, HCluster, DIANA, HKMeans, CLARA, and SOTA. A majority vote was applied to the results from AGNES, HCluster, and DIANA to obtain the final grouping. Five groups were defined: outliers, severely impaired voice, minor impaired voice, healthier voice, and cannot be grouped. All methods except SOTA identified the same two outliers.
The clustering and voting identified 2 outliers, 5 subjects with more severely impaired voice, 82 with minor impairment, and 22 with healthier voice; only 3 could not be grouped. Feature extraction reduced the data size by a factor of 518. It is therefore possible to reduce the data size before transmission and then perform unsupervised learning at the receiving end for remote monitoring and screening.
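
The majority-vote step described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not state the exact voting rule, so the sketch assumes a subject receives a group when at least two of the three methods agree and is otherwise marked as not groupable. It also assumes the cluster labels have already been mapped to a common meaning across methods, since raw cluster IDs from different algorithms are arbitrary. The function name `majority_vote`, the `UNGROUPED` marker, and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter

# Hypothetical marker for subjects on which the methods disagree.
UNGROUPED = "ungrouped"


def majority_vote(labelings, min_agree=2):
    """Combine per-subject labels from several clustering methods.

    labelings: list of equal-length label lists, one list per method.
    Assumes labels are already aligned to a shared meaning across methods.
    A subject keeps the most common label if at least `min_agree` methods
    chose it; otherwise the subject is marked UNGROUPED.
    """
    n_subjects = len(labelings[0])
    if any(len(labels) != n_subjects for labels in labelings):
        raise ValueError("all methods must label the same number of subjects")

    final = []
    for votes in zip(*labelings):  # one tuple of votes per subject
        label, count = Counter(votes).most_common(1)[0]
        final.append(label if count >= min_agree else UNGROUPED)
    return final


# Toy labels for five imaginary subjects (illustration only).
agnes = ["outlier", "severe", "minor", "healthier", "minor"]
hcluster = ["outlier", "severe", "minor", "healthier", "healthier"]
diana = ["outlier", "minor", "minor", "healthier", "severe"]

final = majority_vote([agnes, hcluster, diana])
print(final)
```

With three methods and a threshold of two, the winning label is unambiguous whenever any two methods agree, and a three-way disagreement corresponds to the "cannot be grouped" category.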
