Integrative gene expression analysis for the diagnosis of Parkinson’s disease using machine learning and explainable AI

Nikita Bhandari, Rahee Walambe, Ketan Kotecha, Mehul Kaliya

doi:10.1016/j.compbiomed.2023.107140

Abstract

Parkinson's disease (PD) is a progressive neurodegenerative disorder. Clinicians diagnose PD from a combination of symptoms and diagnostic tests, but accurate diagnosis at early stages remains difficult. Blood-based markers can support physicians in the early diagnosis and treatment of PD. In this study, we used machine learning (ML) methods to diagnose PD by integrating gene expression data from different sources, and we applied explainable artificial intelligence (XAI) techniques to find the significant set of gene features contributing to the diagnosis. We used the Least Absolute Shrinkage and Selection Operator (LASSO) and Ridge regression for feature selection, and state-of-the-art ML techniques to classify PD cases and healthy controls. Logistic regression and Support Vector Machine showed the highest diagnostic accuracy. We interpreted the Support Vector Machine model with SHapley Additive exPlanations (SHAP), a model-agnostic XAI method that provides global interpretations. This analysis identified a set of significant biomarkers that contributed to the diagnosis of PD; some of these genes are associated with other neurodegenerative diseases. Our results suggest that XAI can be useful in making early therapeutic decisions for the treatment of PD. The integration of datasets from different sources made this model robust. We believe that this research article will be of interest to clinicians as well as computational biologists in translational research.
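The workflow summarized above (sparse feature selection followed by classification) can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn, uses synthetic data in place of the integrated expression matrix, and picks hypothetical settings (`alpha=0.02`, a linear SVM kernel, 5-fold cross-validation). The Ridge and SHAP steps are omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix: 200 samples x 500 "genes",
# 10 of which carry signal. These sizes are illustrative, not from the study.
X, y = make_classification(
    n_samples=200, n_features=500, n_informative=10, n_redundant=0,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)


def build_pipeline(classifier):
    """Scale features, keep those with non-zero LASSO coefficients, classify."""
    # alpha is an assumed value; in practice it would be tuned.
    selector = SelectFromModel(Lasso(alpha=0.02, max_iter=10000))
    return make_pipeline(StandardScaler(), selector, classifier)


cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm_linear": SVC(kernel="linear"),
}

# Selection happens inside each training fold, so held-out samples
# never influence which features are kept.
scores = {
    name: cross_val_score(build_pipeline(clf), X, y, cv=cv, scoring="accuracy")
    for name, clf in models.items()
}
for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.3f}")

# Fit once on all samples to inspect which features the LASSO step retained.
fitted = build_pipeline(SVC(kernel="linear")).fit(X, y)
selected = np.flatnonzero(fitted.named_steps["selectfrommodel"].get_support())
print(f"{selected.size} of {X.shape[1]} features retained")
```

Placing the selector inside the pipeline is a deliberate choice in this sketch: selecting features on the full dataset before cross-validation would leak label information into the held-out folds and inflate the reported accuracy.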

Full Text