Background
Spontaneous intracerebral hemorrhage (sICH) is associated with significant mortality and morbidity. Predicting the prognosis of patients with sICH remains an important problem because it directly affects treatment decisions, and identifying patients at risk of an unfavorable outcome from readily available clinical parameters is clinically valuable. This study uses five machine learning algorithms to build a practical platform for predicting short-term prognosis in patients with sICH.
Methods
In this retrospective analysis, models were trained on data from 413 cases at the training center and subsequently validated on data from an external validation center. Clinical information, laboratory results, and imaging features of sICH patients served as training features. Using all selected features, we developed and validated five models: Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), XGBoost, and LightGBM. Recursive Feature Elimination (RFE) was used for feature screening. An inner five-fold cross-validation was used to select hyperparameters for each model, and an outer five-fold cross-validation was used to identify the model with the best average performance. That model was selected as the final model and applied to external validation. Model performance was evaluated using the ROC curve, accuracy, and other relevant indicators.
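The two-level cross-validation described here can be sketched as index bookkeeping. The following is a minimal illustration only, not the authors' pipeline: the function names, the random seed, and the unstratified shuffling are all assumptions, and no model is fitted. It shows the one property that matters for the design, namely that the folds used for hyperparameter tuning are drawn only from the training portion of each outer fold.

```python
import random


def k_fold_indices(indices, k, seed):
    """Shuffle `indices` and split them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=5, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    inner_splits is a list of (inner_train, inner_val) pairs drawn only from
    outer_train, so the outer test fold never influences hyperparameter tuning.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        # Everything outside the current outer test fold is available for tuning.
        outer_train = [idx for j, fold in enumerate(outer_folds) if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + i + 1)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for j, fold in enumerate(inner_folds) if j != m for idx in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits


# 413 matches the training-center cohort size; the indices stand in for patients.
splits = list(nested_cv_splits(413))
```

In a full pipeline, each candidate hyperparameter setting would be scored on the inner validation folds, the best setting refitted on `outer_train`, and the resulting model scored once on `outer_test`; averaging those outer scores gives the per-model figure used for model selection.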
SHAP plots were used to explain variable importance within the model, and these results were combined with the performance metrics above to identify the most concise feature set and build a practical prognostic prediction platform.
Results
A total of 413 patients with sICH were enrolled at the training center, of whom 180 had a poor prognosis. A total of 74 patients with sICH were enrolled at the external validation center, of whom 26 had a poor prognosis. Within the training set, the test set AUC values for the SVM, LR, RF, XGBoost, and LightGBM models were 0.87, 0.896, 0.916, 0.885, and 0.912, respectively. The RF model had the best average performance in the training set (average AUC: 0.906 ± 0.029, P < 0.01). The model maintained good performance at the external validation center, with an AUC of 0.817 (95% CI 0.705–0.928). In terms of feature importance for short-term prognosis, the NIHSS score ranked first, followed by AST, age, white blood cell count, and hematoma volume, among others. Guided by the RF model's variable importance weights and its ROC curve, the NIHSS score, AST, age, white blood cell count, and hematoma volume were combined into a short-term prognostic prediction platform for sICH patients.
Conclusion
We constructed a prediction model based on the RF model that incorporates five clinically accessible predictors and shows reliable predictive efficacy for the short-term prognosis of sICH patients. Performance remained stable in the external validation set, which supports use of the model for predicting short-term prognosis in sICH patients.
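For readers less familiar with the headline metric, the AUC can be read as the probability that a randomly chosen patient with a poor prognosis receives a higher predicted risk than a randomly chosen patient with a good prognosis. The sketch below computes it directly from that definition. The function name and the toy labels and scores are invented for illustration and are not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = poor prognosis, 0 = good prognosis.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

This pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred cases; library implementations typically use a sort-based computation instead.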