Introduction: Accurate prediction of discharge destination at an early stage would help in managing patients presenting with stroke. This study assessed the ability of three machine learning (ML) algorithms to predict outcomes at four different stages of management and compared the predictive power of stroke scores.

Methods: Patients presenting with acute stroke to the Canberra Hospital between 2015 and 2019 were selected retrospectively. Sixteen potential predictors and one target variable (discharge destination) were obtained from the notes. k-Nearest Neighbour (kNN) and two ensemble-based classification algorithms (Adaptive Boosting and Bootstrap Aggregation) were employed to predict outcomes. Predictive accuracy was assessed at each of the four stages using both overall and per-class accuracy. The contribution of each variable to the prediction outcome was evaluated by the ensemble-based algorithm and by the Relief feature selection algorithm. Various combinations of stroke scores were tested using the same models.

Results: Of the three ML models, Adaptive Boosting demonstrated the highest accuracy (90%) at Stage 4 in predicting death, while the highest overall accuracy (81.7%) was achieved by kNN (k = 2, city-block distance). Feature importance analysis showed that the most important features were the 24-hour Scandinavian Stroke Scale (SSS) and 24-hour National Institutes of Health Stroke Scale (NIHSS) scores, dyslipidaemia, hypertension and premorbid mRS score. The correlation between initial and 24-hour scores was higher for SSS (0.93) than for NIHSS (0.81). Reducing the four scores to the combination of initial SSS and 24-hour NIHSS (InitSSS/24hrNIHSS) increased accuracy in predicting death to 95% (Adaptive Boosting) and overall accuracy to 85.4% (kNN). Accuracies at Stage 2 (pre-treatment, 11 predictors) were not far behind those at Stage 4.

Conclusion: Our findings suggest that even in the early stages of management, a clinically useful prediction regarding discharge destination can be made. Adaptive Boosting might be the best ML model, especially for predicting death. The analysis of predictor importance also showed that dyslipidaemia and hypertension contributed to the discharge outcome even more than expected. Further, and surprisingly, using mixed scoring systems might also lead to higher prediction accuracies.
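
To illustrate how a kNN classifier with k = 2 and city-block distance can be scored on both overall and per-class accuracy, a minimal sketch follows. It is not the study's implementation: the feature values, the class labels other than death ("home", "rehab"), the function names and the tie-breaking rule (the closest neighbour wins) are all assumptions made for the example.

```python
from collections import Counter


def cityblock(a, b):
    """City-block (Manhattan) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=2):
    """Majority vote among the k nearest training points.

    Assumption: a tied vote is resolved in favour of the closest neighbour.
    """
    order = sorted(range(len(train_X)),
                   key=lambda i: cityblock(train_X[i], query))
    nearest = [train_y[i] for i in order[:k]]  # labels, closest first
    counts = Counter(nearest)
    best = max(counts.values())
    # First label (in distance order) that reaches the top vote count.
    return next(lbl for lbl in nearest if counts[lbl] == best)


def accuracies(y_true, y_pred):
    """Return overall accuracy and per-class accuracy.

    Per-class accuracy here is the fraction of cases of each true class
    that were predicted correctly.
    """
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    return overall, per_class


# Hypothetical toy data: two made-up score features per patient.
train_X = [[2, 1], [3, 2], [10, 9], [12, 11], [20, 22], [24, 25]]
train_y = ["home", "home", "rehab", "rehab", "death", "death"]
test_X = [[1, 1], [11, 10], [22, 23], [6, 5]]
test_y = ["home", "rehab", "death", "rehab"]

pred = [knn_predict(train_X, train_y, q) for q in test_X]
overall, per_class = accuracies(test_y, pred)
print(pred)
print(overall, per_class)
```

Reporting per-class accuracy alongside overall accuracy matters when classes are imbalanced: a model can score well overall while performing poorly on a rare but clinically important class such as death.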