Background: Precise evaluation of pathological complete response (pCR) is essential for determining the prognosis of patients with locally advanced rectal cancer (LARC) undergoing neoadjuvant chemoradiotherapy (NCRT) and can guide the selection of subsequent treatment strategies. Most current predictive models for pCR focus primarily on pre-treatment factors, neglect the dynamic systemic changes that occur during NCRT, and are limited by low accuracy and incomplete coverage of relevant factors.
Purpose: This study devised a novel predictor of pCR based on dynamic alterations in systemic inflammation-nutritional marker indexes (SINI) during neoadjuvant therapy and developed a machine learning model to predict pCR.
Methods: Two cohorts of patients with LARC, one from center one (2012 to 2017) and one from center two (2020 to 2023), were integrated for analysis. Dynamic changes in blood indexes were compared before and after neoadjuvant therapy and surgery. Least absolute shrinkage and selection operator (LASSO) regression was used to mitigate collinearity and identify key indexes, from which the SINI was constructed. Univariate and multivariable logistic regression analyses were used to identify independent factors associated with pCR. In addition, 10 machine learning algorithms were used to develop predictive models. Model hyperparameters were optimized using random search with 10-fold cross-validation. The models were assessed using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), decision curve analysis, calibration curves, and precision and accuracy in the internal and external validation cohorts. SHapley Additive exPlanations (SHAP) were used to interpret the machine learning models.
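As a rough illustration of the LASSO selection step described in the Methods, the sketch below fits a LASSO by cyclic coordinate descent and combines the surviving markers into a weighted composite index. It is not the study's implementation: the abstract does not state the loss function, penalty strength, software, or which markers were retained, so the least-squares loss, the continuous toy outcome, the penalty value, and the names `delta_marker_a`, `delta_marker_b`, and `composite_index` are all assumptions made for illustration.

```python
# Minimal LASSO via cyclic coordinate descent on a least-squares loss.
# Assumption: the loss, penalty, data, and marker names are illustrative only;
# the abstract does not specify how the study configured its LASSO.

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; exactly 0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimise (1/2n) * ||y - Xw||^2 + penalty * ||w||_1 over w."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, penalty) / (norm / n) if norm else 0.0
    return w

# Hypothetical toy data: two centred marker changes for four patients.
feature_names = ["delta_marker_a", "delta_marker_b"]
X = [[-1.5, 0.5], [-0.5, -0.5], [0.5, -0.5], [1.5, 0.5]]
y = [-3.0, -1.0, 1.0, 3.0]  # centred toy outcome driven only by the first marker

weights = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [name for name, w in zip(feature_names, weights) if w != 0.0]

def composite_index(row, weights):
    """Weighted sum of markers; zero-weight markers drop out of the index."""
    return sum(w * x for w, x in zip(weights, row))

scores = [composite_index(row, weights) for row in X]
```

In this toy setting the L1 penalty sets the weight of the uninformative marker to exactly zero, which is the property that makes LASSO usable for selecting a small set of key indexes from correlated candidates.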
Results: The study cohort comprised 677 patients from center one and 224 patients from center two. Six key indexes were identified and used to construct a predictive index, the SINI. Univariate and multivariable logistic regression analyses showed that SINI, clinical T-stage, clinical N-stage, tumor size, and distance from the anal verge were independent predictors of pCR in patients with LARC following NCRT. The mean AUC of the extreme gradient boosting (XGB) model in 10-fold cross-validation on the training set was 0.877. The XGB model demonstrated superior performance in the internal and external validation sets. In the internal test set, it achieved an AUC of 0.86, AUPRC of 0.707, accuracy of 0.82, and precision of 0.80. In the external validation set, it achieved an AUC of 0.83, AUPRC of 0.702, accuracy of 0.81, and precision of 0.81. The predictions generated by the XGB model were further analyzed using SHAP.
Conclusion: This study developed and validated an XGB model using SINI to predict pCR in patients with LARC. The SINI-based machine learning model shows promise for accurately predicting pCR following NCRT in patients with resectable LARC and may offer valuable insights for personalized treatment approaches.
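For readers less familiar with the metrics reported above, the following is a minimal sketch of how AUC, AUPRC, accuracy, and precision can be computed from predicted probabilities. The labels, scores, and the 0.5 decision threshold are hypothetical toy values, not study data. AUPRC is approximated here by average precision, which is one common estimator, and the abstract does not say which estimator the authors used.

```python
# Toy computation of the four reported metrics, standard library only.
# Assumption: data and threshold are made up; this is not the study's pipeline.

def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean of precision at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits

def accuracy_and_precision(y_true, scores, threshold=0.5):
    """Accuracy and precision after thresholding the predicted probabilities."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(preds, y_true) if p == y)
    true_pos = sum(1 for p, y in zip(preds, y_true) if p == 1 and y == 1)
    pred_pos = sum(preds)
    precision = true_pos / pred_pos if pred_pos else 0.0
    return correct / len(y_true), precision

# Hypothetical labels (1 = pCR) and predicted probabilities for six patients.
y_true = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

auc = roc_auc(y_true, scores)
auprc = average_precision(y_true, scores)
accuracy, precision = accuracy_and_precision(y_true, scores)
```

AUC and AUPRC depend only on how the model ranks patients, whereas accuracy and precision depend on the chosen threshold, which is why the two pairs of metrics can diverge on an imbalanced outcome such as pCR.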