Objective: To construct an ensemble machine learning model for predicting clinically relevant postoperative pancreatic fistula (CR-POPF) after pancreaticoduodenectomy and to evaluate its application value. Methods: This was a predictive-model study. Clinical data of 421 patients who underwent pancreaticoduodenectomy in the Department of Pancreatic Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology from June 2020 to May 2023 were retrospectively collected. There were 241 males (57.2%) and 180 females (42.8%), aged (59.7±11.0) years (range: 12 to 85 years). The patients were divided into a training set (315 cases) and a test set (106 cases) by stratified random sampling at a ratio of 3:1. Recursive feature elimination was used to screen features, nine machine learning algorithms were used for modeling, the three models with the best fit were selected, and the ensemble model was constructed by fusing them with the Stacking algorithm. Model performance was evaluated with multiple metrics, and the interpretability of the optimal model was analyzed with the Shapley Additive Explanations (SHAP) method. The patients in the test set were divided into risk groups according to the predicted probability (P) of the alternative fistula risk score (a-FRS). The a-FRS was validated, and its predictive performance was compared with that of the model. Results: Among the 421 patients, CR-POPF occurred in 84 cases (20.0%). In the test set, the Stacking ensemble model performed best, with an area under the receiver operating characteristic curve (AUC) of 0.823, an accuracy of 0.83, an F1 score of 0.63, and a Brier score of 0.097.
The SHAP summary plot showed that the top nine factors affecting CR-POPF after pancreaticoduodenectomy were pancreatic duct diameter, CT value ratio, postoperative serum amylase, IL-6, body mass index, operative time, difference in albumin before and after surgery, procalcitonin, and IL-10. Each feature showed a complex nonlinear relationship with the occurrence of CR-POPF after pancreaticoduodenectomy. The risk of CR-POPF increased when pancreatic duct diameter was <3.5 mm, CT value ratio was <0.95, postoperative serum amylase concentration was >150 U/L, IL-6 level was >280 ng/L, operative time was >350 minutes, or albumin decreased by more than 10 g/L. The AUC of the a-FRS in the test set was 0.668, lower than that of the Stacking ensemble machine learning model. Conclusion: The ensemble machine learning model constructed in this study can predict the occurrence of CR-POPF after pancreaticoduodenectomy and has the potential to serve as a tool for personalized diagnosis and treatment after pancreaticoduodenectomy.
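
For readers less familiar with the metrics reported above (AUC, accuracy, F1 score, Brier score), the following is a minimal sketch of how each can be computed from true labels and predicted probabilities. It is not the authors' code. The function names, the 0.5 classification threshold, and the eight toy labels and probabilities are all assumptions made for illustration and are not data from the study.

```python
from typing import Sequence, Tuple


def auc_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case receives
    a higher predicted probability than a randomly chosen negative case
    (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome;
    lower values indicate better-calibrated, sharper predictions."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def accuracy_and_f1(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and F1 after dichotomizing probabilities at `threshold`
    (the 0.5 default is an assumption, not a value from the study)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 1)
    fp = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 1)
    fn = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 0)
    tn = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy data: 1 = CR-POPF, 0 = no CR-POPF.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]

auc = auc_score(y_true, y_prob)
brier = brier_score(y_true, y_prob)
accuracy, f1 = accuracy_and_f1(y_true, y_prob)
print(f"AUC={auc:.3f} accuracy={accuracy:.2f} F1={f1:.2f} Brier={brier:.3f}")
```

AUC and the Brier score are computed directly from the probabilities, whereas accuracy and F1 depend on the chosen threshold. With an event rate of about 20%, as in this cohort, F1 is generally more informative than accuracy alone.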