Abstract

Objective: To develop an interpretable stroke prediction and feature analysis model that combines XGBoost and SHAP to support the clinical diagnosis and prevention of stroke. Methods: Using an open data set from Kaggle, we preprocessed the data, tuned the model parameters by grid search, and built a stroke risk prediction model with XGBoost; SHAP was then used to explain the contribution of each risk factor. Results: The XGBoost model achieved an accuracy of 96.71%, a sensitivity of 93.83%, a specificity of 99.59%, and an area under the receiver operating characteristic (ROC) curve (AUC) of 99.19%. The SHAP analysis identified age, type of residence, and history of hypertension as the key factors affecting the incidence of stroke. Conclusion: On this data set, the established model can be used to identify stroke, and the SHAP-based explanations increase the model’s transparency and help medical practitioners assess its reliability.
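The four metrics reported in the Results can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made for illustration only, and the AUC is computed with the standard pairwise-ranking (Mann-Whitney) formulation rather than any particular library routine.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for binary labels, where 1 = stroke and 0 = no stroke."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binary_metrics(y_true: Sequence[int], y_score: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and AUC from true labels and predicted scores."""
    # The threshold is an assumed cut-off; the abstract does not state one.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auc": roc_auc(y_true, y_score),
    }


# Toy data, invented for illustration; not taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
toy_metrics = binary_metrics(toy_labels, toy_scores)
print(toy_metrics)
```

Accuracy is the prevalence-weighted average of sensitivity and specificity, so the three values should always be read together, particularly when stroke cases are a small fraction of the data.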
