Abstract

Machine learning (ML) has made significant contributions to stroke prevention, but the stability and accuracy of existing models in clinical applications remain uncertain. This study develops and validates an interpretable ML model that uses metabolic and coagulation biomarkers to predict ischemic stroke in elderly hypertensive patients in Northwest China. The model was built on 453 electronic medical records (80% as a training set and 20% as a test set) and externally validated on a further 132 records. Seven key features (D‐dimer, cystatin C, homocysteine, hemoglobin A1c, prothrombin time, low‐density lipoprotein cholesterol, and triglyceride glucose‐body mass index) were selected using an advanced approach (elastic net) and classical wrapper approaches. The final model, eXtreme gradient boosting, outperformed the other 9 classifiers (random forest, Gaussian process, multilayer perceptron, logistic regression, support vector machine, K‐nearest neighbor, decision tree, Gaussian naive Bayes, and ensemble model), with areas under the receiver‐operating characteristic curve of 0.97 and 0.94 on the test and external validation sets, respectively. Multiple performance metrics and decision curve analysis indicate that the final model has excellent stability, accuracy, and clinical usefulness. Additionally, an online human–machine interface application has been developed for clinical practice to support early identification of, and intervention for, ischemic stroke in elderly hypertensive patients.