The objective of this study is to develop accurate machine learning (ML) models for predicting the neurological status at hospital discharge of critically ill patients with hemorrhagic and ischemic stroke and identify the risk factors associated with the neurological outcome of stroke, thereby providing healthcare professionals with enhanced clinical decision-making guidance. Data of stroke patients were extracted from the eICU Collaborative Research Database (eICU-CRD) for training and testing sets and the Medical Information Mart for Intensive Care IV (MIMIC IV) database for external validation. Four machine learning models, namely gradient boosting classifier (GBC), logistic regression (LR), multi-layer perceptron (MLP), and random forest (RF), were used for prediction of neurological outcome. Furthermore, shapley additive explanations (SHAP) algorithm was applied to explain models visually. A total of 1,216 hemorrhagic stroke patients and 954 ischemic stroke patients from eICU-CRD and 921 hemorrhagic stroke patients 902 ischemic stroke patients from MIMIC IV were included in this study. In the hemorrhagic stroke cohort, the LR model achieved the highest area under curve (AUC) of 0.887 in the test cohort, while in the ischemic stroke cohort, the RF model demonstrated the best performance with an AUC of 0.867 in the test cohort. Further analysis of risk factors was conducted using SHAP analysis and the results of this study were converted into an online prediction tool. ML models are reliable tools for predicting hemorrhagic and ischemic stroke neurological outcome and have the potential to improve critical care of stroke patients. The summarized risk factors obtained from SHAP enable a more nuanced understanding of the reasoning behind prediction outcomes and the optimization of the treatment strategy.
Read full abstract