Abstract

Background: Early identification of dementia is crucial for prompt intervention and better outcomes for high‐risk individuals in the general population. Machine learning has produced highly accurate dementia prediction models that could aid early classification; however, these models have relied on expensive predictors. Exploring more accessible predictors is crucial if machine learning models are to be used widely in clinical practice.

Method: Data from 4,793 individuals without dementia or mild cognitive impairment at baseline were included from the population‐based AGES‐Reykjavik Study. Cognitive, biometric, and MRI assessments (total: 64 variables) were collected at baseline, with follow‐up of incident dementia diagnoses for a maximum of 12 years. Elastic net regression, random forest, support vector machine, Naïve Bayes, logistic regression, and elastic net Cox regression (for time‐to‐event analysis) were explored as possible algorithms. Model 1 was fit using all variables, Model 2 after feature selection using the Boruta package, and Model 3 without neuroimaging markers (the clinically accessible model). Ten‐fold cross‐validation, repeated ten times, was implemented during training. Upsampling was used to account for imbalanced data. Tuning parameters were optimized for recalibration automatically using the caret package.

Result: During training, the Model 2 elastic net regression had the highest AUC [0.78; 95% CI: 0.76‐0.80], sensitivity [72; 95% CI: 68‐75], and specificity [72; 95% CI: 70‐73]. For Model 3, which excluded MRI markers, the AUC remained high [0.75; 95% CI: 0.73‐0.77]. Results for Model 3 were similar in the test data [AUC 0.74; 95% CI: 0.71‐0.77], suggesting a low risk of overfitting. Time‐to‐event analysis using elastic net Cox models showed similar discrimination [c‐statistic 0.79] during testing. The most important variables based on Boruta selection included the Activities of Daily Living, presence of the APOE ε4 allele, memory functioning, and sex.

Conclusion: Supervised machine learning could be used to identify individuals at high risk for dementia in the general population using easily accessible variables. As dementia becomes a leading problem in developing countries, this clinically accessible model could be explored for use in these areas to better identify individuals at risk in the community. Further external validation is needed.
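
The resampling scheme named in the Method (ten‐fold cross‐validation repeated ten times, with upsampling for class imbalance) can be illustrated with a minimal sketch. This is not the study's code: the analysis used the caret package in R, whereas the Python below uses only the standard library. The function names (`stratified_folds`, `upsample`, `repeated_cv_splits`) are hypothetical, and applying upsampling only to the training portion of each fold, so that held‐out cases are never duplicated into training data, is an assumption about how the procedure was configured.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, rng):
    """Split indices into k folds, keeping the class ratio similar in each fold."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def upsample(train_idx, labels, rng):
    """Resample smaller classes with replacement until every class matches the largest."""
    by_class = defaultdict(list)
    for idx in train_idx:
        by_class[labels[idx]].append(idx)
    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    return balanced


def repeated_cv_splits(labels, k=10, repeats=10, seed=0):
    """Yield (balanced_train_idx, test_idx) for every fold of every repeat.

    Assumption: upsampling touches the training indices only; the test fold
    keeps its original class balance.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        folds = stratified_folds(labels, k, rng)
        for i, test_idx in enumerate(folds):
            train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield upsample(train_idx, labels, rng), test_idx
```

With the defaults, the generator yields 100 train/test splits. A model would be fit on each balanced training set and scored (for example, by AUC) on the untouched test fold, and the 100 scores would then be summarized.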

