Abstract

<h3>Purpose/Objective(s)</h3> Following neoadjuvant chemoradiotherapy (nCRT), pathologic complete response (pCR) strongly influences the decision to proceed with surgery versus "watchful waiting" in rectal cancer (RC) patients. The purpose of this study is to predict pCR without using invasive procedures. An interpretable machine learning model, trained on clinical and imaging data from diagnosis and treatment together with extracted radiomics (R) and dosiomics (D) features, is used to gain insight into contributing factors.
<h3>Materials/Methods</h3> This study used multi-institutional datasets, including a training set of 180 patients from our institution and an independent test set of 37 patients from the RTOG 0822 clinical trial. Each patient had a radiotherapy planning CT and the associated contours of the gross tumor volume (GTV) and the organs at risk (OARs), including the bladder, bowel_small, and femur_heads. A total of 296 features were extracted, including clinical parameters (CP), GTV and OAR dose-volume histogram (DVH) features, GTV R features, and GTV D features. R and D features were subcategorized into first-order (L1), second-order (L2), and higher-order (L3) local texture features. Multiview input data analysis was performed to identify an optimal set of input feature categories by using an exhaustive search. For each input, feature selection was performed using Boruta, followed by collinearity removal based on the variance inflation factor. An explainable boosting machine (EBM), an interpretable glass-box model, was trained using the selected features. The performance of the EBM on the test set was evaluated using the area under the receiver operating characteristic curve (AUC) and compared with that of 3 state-of-the-art black-box models: extreme gradient boosting (XGB), random forest (RF), and support vector machine (SVM).
<h3>Results</h3> Selected features included two shape features (elongation, maximum2DDiameterColumn) and one first-order feature (variance) from R. Global explanations of the EBM showed that tumors with maximum2DDiameterColumn <40 mm or >90 mm, elongation <0.55, and higher variance of CT intensities were associated with unfavorable outcomes. For all 4 models, the best predictive performance was obtained with an optimal input feature subset of CP+DVH+L1<sub>R</sub>+L1<sub>D</sub>, which significantly outperformed the full feature set of CP+DVH+R<sub>all</sub>+D<sub>all</sub> (AUC=0.602 to 0.618 for the full set). With the optimal subset, the EBM had the best performance for predicting pCR (AUC=0.855), followed by RF, SVM, and XGB (AUC=0.839, 0.822, and 0.809, respectively).
<h3>Conclusion</h3> EBM has the potential to improve the predictability of pCR while also enhancing model interpretability, which can aid in radiotherapy decision-making. Using an optimal input feature set selected by multiview input analysis can improve performance compared with concatenating all available features, which may, in turn, have implications for patient selection for a "watchful waiting" approach in RC management.
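The exhaustive search over input feature categories described in the Materials/Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the category labels, the choice to enumerate every non-empty combination, and the `evaluate` callback are assumptions made for illustration. In a real pipeline, `evaluate` would presumably run Boruta selection, variance-inflation-factor filtering, and model training on the features of the given view, then return an AUC.

```python
from itertools import combinations

# Feature categories named in the abstract. Treating each one as an
# independent "view" that can be switched on or off is an assumption.
CATEGORIES = ["CP", "DVH", "L1_R", "L2_R", "L3_R", "L1_D", "L2_D", "L3_D"]


def roc_auc(labels, scores):
    """AUC via the Mann-Whitney identity; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs in which the positive case scores higher.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def candidate_views(categories):
    """Yield every non-empty combination of feature categories."""
    for k in range(1, len(categories) + 1):
        for combo in combinations(categories, k):
            yield combo


def exhaustive_search(categories, evaluate):
    """Return (best_view, best_auc).

    `evaluate` is a hypothetical callback that maps a view (a tuple of
    category names) to an AUC measured on held-out data.
    """
    best_view, best_auc = None, float("-inf")
    for view in candidate_views(categories):
        auc = evaluate(view)
        if auc > best_auc:
            best_view, best_auc = view, auc
    return best_view, best_auc
```

With 8 categories this enumerates 255 candidate views, which is small enough to evaluate exhaustively. To avoid optimistic bias, the AUC returned by `evaluate` should come from cross-validation on the training set rather than from the independent test set.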
