Objectives: The aim of this study was to develop machine learning-based models for predicting acute cerebral infarction (ACI) in patients.

Methods: We extracted data on ACI patients and non-ACI patients (as controls) from two hospitals. The Lasso algorithm was used to select the features most strongly associated with ACI. Five machine learning models were trained using 10-fold cross-validation. The area under the receiver operating characteristic curve (AUC), accuracy, and F1-score were then calculated for each trained model, and the best-performing model was selected as the final predictive model. The relative importance of the variables was analyzed and ranked.

Results: A total of 150 patients were diagnosed with ACI (50.00%), with a higher proportion of males than among the non-ACI patients (70.67% vs. 44.00%). The logistic regression model performed well in predicting ACI in the training set, achieving the highest AUC, accuracy, sensitivity, and F1-score. Feature importance analysis showed that blood glucose, gender, smoking history, serum homocysteine, folic acid, and C-reactive protein were the six most important variables in the logistic regression model.

Conclusions: In our work, the ACI risk prediction model developed with logistic regression exhibited excellent performance. This could contribute to the identification of risk variables for ACI and enable clinicians to intervene in a timely and effective manner.
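
The three evaluation metrics named in the Methods (AUC, accuracy, and F1-score) can be illustrated with a minimal sketch. The abstract does not describe how the study computed them, so the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions made for illustration and are not study data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = ACI, 0 = non-ACI.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]          # predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"F1-score = {f1_score(y_true, y_pred):.3f}")
print(f"AUC      = {roc_auc(y_true, scores):.3f}")
```

Accuracy and F1-score depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason the three are commonly reported together when comparing candidate models.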