Abstract

Late-onset Alzheimer's disease (LOAD) is the most common type of dementia, but its pathogenesis remains unclear, and simple, convenient markers for early diagnosis are lacking. This study aimed to identify candidate diagnostic genes for predicting LOAD using machine learning methods. Three publicly available datasets containing peripheral blood gene expression data for LOAD, mild cognitive impairment (MCI), and controls (CN) were downloaded from the Gene Expression Omnibus (GEO) database. Differential expression analysis, the least absolute shrinkage and selection operator (LASSO), and support vector machine recursive feature elimination (SVM-RFE) were used to identify candidate diagnostic genes for LOAD. These candidate genes were then validated in a validation group and in clinical samples, and a LOAD prediction model was established. LASSO and SVM-RFE analyses identified three mitochondria-related genes (MRGs) as candidate genes: NDUFA1, NDUFS5, and NDUFB3. In validation of the three MRGs, the area under the curve (AUC) values showed that NDUFA1 and NDUFS5 had the better predictive performance. The candidate MRGs were also validated in the MCI groups, where the AUC values indicated good performance. A LOAD diagnostic model built from NDUFA1, NDUFS5, and age achieved an AUC of 0.723. qRT-PCR experiments on clinical blood samples showed that expression of all three candidate genes was significantly lower in the LOAD and MCI groups than in CN. In summary, two mitochondria-related candidate genes, NDUFA1 and NDUFS5, were identified as diagnostic markers for LOAD and MCI, and combining these two genes with age yielded a LOAD diagnostic prediction model.
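Because the abstract reports AUC values throughout, a minimal sketch of how such a value can be computed may help the reader. It treats the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control. The function name `roc_auc` and the toy expression values are hypothetical illustrations, not the study's data or code.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC: the fraction of (case, control) pairs in which
    the case has the higher score, counting ties as half a pair."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 1 = case, 0 = control (not values from the study).
labels = [1, 1, 1, 0, 0, 0]
expression = [2.1, 2.8, 3.0, 2.9, 3.4, 3.6]

# The abstract reports lower expression in cases, so the expression is
# negated to make a higher score indicate a case.
scores = [-x for x in expression]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.723 should be read.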
