Explainable machine learning models based on multimodal time-series data for the early detection of Parkinson’s disease

Muhammad Junaid, Sajid Ali, Fatma Eid, Shaker El-Sappagh, Tamer Abuhmed

doi:10.1016/j.cmpb.2023.107495

Abstract

Background and objectives: Parkinson’s disease (PD) is a devastating chronic neurological condition. Machine learning (ML) techniques have been used for the early prediction of PD progression. Fusing heterogeneous data modalities has been shown to improve the performance of ML models, and fusing time series data supports tracking the disease over time. In addition, model explainability features improve the trustworthiness of the resulting models. The literature on PD has not sufficiently explored these three points.

Methods: In this work, we propose an ML pipeline for predicting the progression of PD that is both accurate and explainable. We explore the fusion of different combinations of five time series modalities from the Parkinson’s Progression Markers Initiative (PPMI) real-world dataset: patient characteristics, biosamples, medication history, motor function, and non-motor function data. Each patient has six visits. The problem is formulated in two ways: ❶ a three-class progression prediction with 953 patients in each time series modality, and ❷ a four-class progression prediction with 1,060 patients in each time series modality. Statistical features of the six visits were calculated for each modality, and diverse feature selection methods were applied to select the most informative feature sets. The extracted features were used to train a set of well-known ML models, including support vector machines (SVM), random forests (RF), extra tree classifier (ETC), light gradient boosting machines (LGBM), and stochastic gradient descent (SGD). We examined a number of data-balancing strategies in the pipeline with different combinations of modalities. The ML models were optimized using a Bayesian optimizer. A comprehensive evaluation of the ML methods was conducted, and the best models were extended to provide different explainability features.
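As a rough illustration of the step that turns six visits into statistical features, the sketch below summarizes one measurement per patient. The abstract does not say which statistics were used, so the choice here (mean, standard deviation, minimum, maximum, and a least-squares slope across visits) is an assumption, as are the function name `visit_features` and the example scores, which are invented and not PPMI data.

```python
from statistics import mean, pstdev


def visit_features(values):
    """Summarize one measurement tracked over a patient's visits.

    The set of statistics is an assumption; the paper's abstract only
    says that statistical features were computed from the six visits.
    """
    n = len(values)
    t_mean = (n - 1) / 2          # mean of visit indices 0..n-1
    v_mean = mean(values)
    # Least-squares slope of the value against the visit index.
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return {
        "mean": v_mean,
        "std": pstdev(values),    # population standard deviation
        "min": min(values),
        "max": max(values),
        "slope": num / den,       # average change per visit
    }


# Hypothetical scores for one feature across six visits (not real data).
scores = [1, 2, 2, 3, 4, 6]
features = visit_features(scores)
```

Applying such a function to every measurement in a modality yields one fixed-length row per patient, which is the form that classifiers such as RF or LGBM expect.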
Results: We compared the performance of the ML models before and after optimization, and with and without feature selection. In the three-class experiment and with various modality fusions, the LGBM model produced the most accurate results with a 10-fold cross-validation (10-CV) accuracy of 90.73% using the non-motor function modality. RF produced the best results in the four-class experiment with various modality fusions, with a 10-CV accuracy of 94.57% using the non-motor modality. With the fused dataset of non-motor and motor function modalities, the LGBM model outperformed the other ML models in both the three-class and four-class experiments (i.e., 10-CV accuracy of 94.89% and 93.73%, respectively). Using the SHapley Additive exPlanations (SHAP) framework, we employed global and instance-based explanations to explain the behavior of each ML classifier. Moreover, we extended the explainability by implementing the LIME and SHAPASH local explainers, and we explored the consistency of these explainers. The resulting classifiers were accurate and explainable, and thus medically more relevant and applicable.

Conclusions: The selected modalities and feature sets were confirmed by the literature and medical experts. The various explainers suggest that the bradykinesia (NP3BRADY) feature was the most dominant and consistent. By providing thorough insights into the influence of multiple modalities on disease risk, the suggested approach is expected to help improve clinical knowledge of PD progression processes.
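The instance-based explanations mentioned in the Results rest on Shapley values, which split a single prediction among the input features. The sketch below computes exact Shapley values for a toy model by enumerating feature subsets; it is not the SHAP library and not the authors' code. Replacing "absent" features with a single baseline vector is one common simplification, and the linear model, its weights, and the input values are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are replaced by their baseline value
    (an assumption; other definitions of "absent" features exist).
    """
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical linear risk score over three features (illustrative only).
weights = [2.0, 3.0, -1.0]


def toy_model(x):
    return sum(w * v for w, v in zip(weights, x))


instance = [1.0, 2.0, 3.0]   # the patient being explained (made-up values)
baseline = [0.0, 1.0, 1.0]   # reference input, e.g. a cohort average
phi = shapley_values(toy_model, instance, baseline)
```

For a linear model each attribution reduces to the weight times the feature's deviation from the baseline, and the attributions always sum to the difference between the prediction and the baseline prediction. Exact enumeration grows exponentially with the number of features, which is why practical explainers rely on approximations or model-specific algorithms.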

Journal: Computer Methods and Programs in Biomedicine
Publication Date: Mar 23, 2023
Citations: 30
