Abstract

Ensemble methods are known to enhance prediction performance. We therefore focused on developing a weighted ensemble method using a novel combination of Cerebrospinal Fluid (CSF) protein biomarkers to predict the earlier stages of Alzheimer's disease (AD) with greater accuracy than the state-of-the-art CSF protein biomarkers. Two feature selection methods, Recursive Feature Elimination (RFE) and L1 regularization, were used to screen the most important subset of features for building a classification model on the Mild Cognitive Impairment (MCI) dataset. A novel combination of three biomarkers, namely Cystatin C, matrix metalloproteinase-10 (MMP10), and tau protein, was selected by RFE based on the linear Support Vector Machine (SVM) and Logistic Regression (LR) classifiers. A two-tailed unpaired t-test at a 5% significance level showed a significant difference in the mean levels of Cystatin C, MMP10, and tau protein between the cognitively normal and cognitively impaired groups. An ensemble model that takes a weighted average of the two best-performing classifiers (LR and linear SVM) was built on this subset of the three most informative features. The weighted-average ensemble performed significantly better than either base classifier (LR or linear SVM) alone. The area under the Receiver Operating Characteristic curve (ROC_AUC) and the area under the Precision-Recall curve (AUPR) of our proposed model were 0.9799 ± 0.055 and 0.9108 ± 0.015, respectively. The performance of our proposed weighted-average ensemble model built on the novel combination of CSF protein biomarkers was significantly better (p < 0.001) than that of models built on different combinations of CSF protein biomarkers obtained from recent studies. An ensemble-learning-based application was implemented and deployed on Heroku at https://appsalzheimer.herokuapp.com.
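The weighted-average step described above can be illustrated with a minimal sketch. It assumes each base classifier (LR and linear SVM) already outputs a probability of cognitive impairment per subject. The function names, the weights 0.6 and 0.4, the 0.5 threshold, and the probability values are all hypothetical choices for illustration; the abstract does not state the weights or the implementation the authors used.

```python
from typing import List, Sequence


def weighted_average_proba(
    proba_a: Sequence[float],
    proba_b: Sequence[float],
    weight_a: float,
    weight_b: float,
) -> List[float]:
    """Blend two classifiers' positive-class probabilities, sample by sample."""
    if len(proba_a) != len(proba_b):
        raise ValueError("both classifiers must score the same samples")
    total = weight_a + weight_b
    if weight_a < 0 or weight_b < 0 or total <= 0:
        raise ValueError("weights must be non-negative and not both zero")
    # Normalise by the weight sum so the result is still a probability.
    return [(weight_a * a + weight_b * b) / total
            for a, b in zip(proba_a, proba_b)]


def predict_labels(proba: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn blended probabilities into 0/1 labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in proba]


# Made-up probabilities for four subjects; NOT data from the study.
lr_proba = [0.9, 0.4, 0.2, 0.6]    # hypothetical Logistic Regression outputs
svm_proba = [0.7, 0.6, 0.1, 0.3]   # hypothetical linear SVM outputs

ensemble_proba = weighted_average_proba(lr_proba, svm_proba,
                                        weight_a=0.6, weight_b=0.4)
ensemble_labels = predict_labels(ensemble_proba)
```

In this toy case the two base classifiers disagree on the second and fourth subjects, and the blended score settles the disagreement according to the chosen weights. In practice the weights would be tuned on validation data, for example in proportion to each classifier's validation performance.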

Highlights

  • Alzheimer's disease is a neurodegenerative disorder that causes irreversible and progressive brain cell damage, usually affecting people in their mid-60s [1,2]

  • We focused on Recursive Feature Elimination (RFE), a wrapper-based feature selection method, and L1 regularization, an embedded feature selection method, to screen the most important subset of features

  • The different sets of features listed in Table I were used to train and test four cost-sensitive classifiers (Random Forest (RF), Logistic Regression (LR), LibSVM, and Decision Tree (DT)); the comparative performance evaluation is shown in Fig. 8 (a-d)
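The wrapper-style selection named in the highlights can be sketched as a loop: fit a model on the current features, drop the least important one, and repeat. The sketch below covers only RFE, not L1 regularization. The function `recursive_feature_elimination`, the stand-in scorer `toy_fit_importances`, the extra markers `marker_x` and `marker_y`, and every weight in `TOY_WEIGHTS` are hypothetical and invented for illustration. In the study the importances come from linear SVM and LR classifiers, which this sketch does not reproduce.

```python
from typing import Callable, Dict, List, Sequence


def recursive_feature_elimination(
    features: Sequence[str],
    fit_importances: Callable[[List[str]], Dict[str, float]],
    n_keep: int,
) -> List[str]:
    """Drop the least important feature, refit, and repeat until n_keep remain."""
    if not 1 <= n_keep <= len(features):
        raise ValueError("n_keep must be between 1 and the number of features")
    remaining = list(features)
    while len(remaining) > n_keep:
        # Refit on the current subset; a real run would train a linear model
        # here and read its coefficients.
        importances = fit_importances(remaining)
        weakest = min(remaining, key=lambda name: abs(importances[name]))
        remaining.remove(weakest)
    return remaining


# Invented coefficient magnitudes, NOT values from the study.
TOY_WEIGHTS = {
    "cystatin_c": 0.8,
    "mmp10": -0.7,
    "tau": 0.9,
    "marker_x": 0.1,   # hypothetical uninformative marker
    "marker_y": -0.2,  # hypothetical uninformative marker
}


def toy_fit_importances(subset: List[str]) -> Dict[str, float]:
    """Stand-in for 'train a linear classifier and return its coefficients'."""
    return {name: TOY_WEIGHTS[name] for name in subset}


selected = recursive_feature_elimination(
    list(TOY_WEIGHTS), toy_fit_importances, n_keep=3
)
```

Because the model is refit after every elimination, a real scorer can rank the surviving features differently from one round to the next. That refitting is what distinguishes a wrapper method such as RFE from a one-shot filter.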


Summary

Introduction

Alzheimer's disease is a neurodegenerative disorder that causes irreversible and progressive brain cell damage, usually affecting people in their mid-60s [1,2]. Preclinical changes in the brain associated with Alzheimer's begin years before the onset of the disease's typical clinical symptoms. Though the onset of AD cannot be reversed or stopped, early detection of the disease can allow treatment and prompt care of Alzheimer's patients in the earlier stages, before irreparable damage to the brain has occurred [3,4]. The biochemical changes in CSF associated with AD's progression provide a sound potential source of diagnostic biomarkers for studying the disease's preclinical and clinical stages. Conventional CSF biomarkers, namely tau, amyloid-β42 (Aβ42), and phosphorylated forms of tau (p-tau), have shown significant potential in screening MCI patients who eventually progressed to clinically diagnosable AD [15,16,17,18,19].

