Abstract

Objective: To assess the feasibility of applying machine learning (ML) to neuropsychological assessment data to determine the severity of frontotemporal dementia (FTD) as classified by the Clinical Dementia Rating (CDR).

Method: A Random Forest classification model was applied to a deidentified FTD dataset. Participants (N = 3023) were grouped by CDR severity: Normal (n = 1646, mean age = 46.95, 59.66% female), Mild (n = 855, mean age = 63.06, 57.08% female), and Moderate–Severe (n = 522, mean age = 64.57, 62.45% female). Participants were then randomly split, with stratification, into a training set (80%) and a testing set (20%). The model was trained on the training set using age, gender, verbal fluency, digit span forward, digit span backward, Trails A, Trails B, and the MoCA total score, and was then evaluated on the testing set.

Results: The model achieved an accuracy of 74.55% on the testing set. Per-class performance was 86% precision, 91% recall, and 89% F1-score for the Normal category; 56% precision, 56% recall, and 56% F1-score for Mild; and 65% precision, 52% recall, and 58% F1-score for Moderate–Severe. The most important features were verbal fluency and age, with importance scores of 0.174 and 0.150, respectively.

Conclusions: Although the Random Forest model showed potential for predicting FTD severity from neuropsychological data, its moderate performance in the Mild and Moderate–Severe categories indicates that it is not yet clinically viable. This study offers a foundational framework and suggests that further model refinement and larger datasets are needed to improve accuracy and clinical utility. Future research should include more diverse variables and populations.
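As a reading aid, the relationship between the reported per-class metrics can be sketched in a few lines of Python. This is an illustrative consistency check, not the authors' code. It assumes that the test-set class proportions match the full-sample group sizes (plausible for a stratified split, but not stated), and it works from the rounded percentages in the abstract, so small discrepancies are expected. The names `reported`, `f1_score`, and `weighted_recall` are hypothetical.

```python
# Per-class precision and recall as reported in the abstract (rounded),
# with full-sample group sizes used as stand-ins for test-set support.
# ASSUMPTION: the stratified split preserved these class proportions.
reported = {
    "normal": {"precision": 0.86, "recall": 0.91, "n": 1646},
    "mild": {"precision": 0.56, "recall": 0.56, "n": 855},
    "moderate_severe": {"precision": 0.65, "recall": 0.52, "n": 522},
}


def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_recall(classes):
    """Return the support-weighted mean of per-class recall.

    This equals overall accuracy when the weights are the true class
    proportions of the evaluated set.
    """
    total = sum(c["n"] for c in classes.values())
    return sum(c["recall"] * c["n"] for c in classes.values()) / total


f1_by_class = {
    name: f1_score(c["precision"], c["recall"]) for name, c in reported.items()
}
approx_accuracy = weighted_recall(reported)

for name, f1 in f1_by_class.items():
    print(f"{name}: F1 ~ {f1:.3f}")
print(f"approximate accuracy ~ {approx_accuracy:.3f}")
```

Under these assumptions, the recomputed F1-scores and the support-weighted recall fall within about one percentage point of the reported values, which is consistent with rounding of the inputs. The weighting also makes clear that overall accuracy is driven mainly by the Normal group, the largest class.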