Current strategies for localized rectal cancer include a multimodal approach with neoadjuvant treatment followed by surgery. Some patients achieve a complete response after chemoradiation (CRT) and may be candidates for nonoperative management. We aimed to identify complete responders based solely on pretreatment radiological information using radiomics. We retrospectively studied 44 randomly selected rectal cancer patients who underwent an MR study for rectal cancer staging and were subsequently treated with CRT. Median age was 66 years and 59% were female. Tumors were located in the proximal rectum in 11 patients, the middle rectum in 25 and the distal rectum in 8. At initial TNM staging, 38 patients had T3 disease, 19 had N1 and 17 had N2. Two patients did not receive chemotherapy. The median radiation dose was 50.4 Gy (48-54 Gy). One radiologist and one radiation oncologist with expertise in MR manually contoured the primary tumor using Philips IntelliSpace Portal software. We computed a total of 1688 radiomic features from the T2w images using the pyRadiomics package. We compared several machine learning (ML) classification algorithms (linear regression (LR), Gaussian processes (GP), multi-layer perceptron (MLP), support vector machines (SVM), and random forest (RF)), as well as several combinations of classifiers, using the scikit-learn Python module. We assessed them with stratified 5-fold cross-validation. Performance was measured as the average area under the curve (AUC) from receiver operating characteristic (ROC) curve analysis across the k splits generated during cross-validation. Mean accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score were also computed. Pathological TNM (ypTNM) staging showed ypT0N0 in 22 patients, ypT3-4 in 9, ypT2 in 10 and ypN1 in 9. Tumor volume in responders and non-responders was 26.10±16.24 cc vs. 22.78±34.49 cc (p = 0.68).
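The stratified 5-fold evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the study's code: the feature matrix is synthetic (generated with `make_classification`), only four of the listed classifier families are shown, and the hyperparameters, the feature standardization step and the helper name `cross_validated_auc` are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a radiomic feature matrix (NOT the study data):
# 44 samples, 50 features, binary label (1 = complete response).
X, y = make_classification(
    n_samples=44, n_features=50, n_informative=5, random_state=0
)

# Assumed hyperparameters; the abstract does not report the ones actually used.
classifiers = {
    "GP": GaussianProcessClassifier(random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    "SVM": SVC(random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Stratification keeps the responder/non-responder ratio similar in each fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def cross_validated_auc(model, X, y, cv):
    """Return mean and standard deviation of the per-fold ROC AUC."""
    # Scaling sits inside the pipeline so it is refit on each training fold.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=cv, scoring="roc_auc")
    return scores.mean(), scores.std()


results = {name: cross_validated_auc(clf, X, y, cv) for name, clf in classifiers.items()}
for name, (mean_auc, std_auc) in results.items():
    print(f"{name}: AUC = {mean_auc:.2f} ± {std_auc:.2f}")
```

With only 44 samples each test fold holds 8 or 9 cases, so a wide spread in per-fold AUC is expected.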
All classifiers showed good ability to identify ypT0N0 (AUC ≥ 0.73±0.17); however, combined classifiers outperformed stand-alone classifiers. Specifically, the RF-MLP, RF-GP-SVM, RF-GP-MLP and RF-GP-LR combinations showed the best results, with a mean accuracy, sensitivity, specificity, PPV, NPV, and F1-score of 0.80±0.13, 0.90±0.20, 0.72±0.10, 0.71±0.12, 0.92±0.16, and 0.79±0.15, respectively, with slight differences in AUC.

CONCLUSION: We trained a series of ML algorithms on T2w MR imaging radiomic features and evaluated their ability to distinguish responders from non-responders. MR data are rich enough in information to predict the response to CRT before treatment, especially when combined classifiers are used. Validation in a prospective set to detect ypT0N0 is warranted. Clinical utility should be evaluated.
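One way to combine classifiers and derive the reported metrics is sketched below. The abstract does not say how the classifiers were combined; soft voting (averaging predicted probabilities) through scikit-learn's `VotingClassifier` is an assumption here, as are the synthetic data, the hyperparameters and the helper names `safe_ratio` and `binary_metrics`. For simplicity the metrics are computed on pooled out-of-fold predictions rather than averaged per fold.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a radiomic feature matrix (NOT the study data).
X, y = make_classification(
    n_samples=44, n_features=50, n_informative=5, random_state=0
)

# Assumed RF-MLP combination: soft voting averages the class probabilities.
rf_mlp = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("mlp", make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
        )),
    ],
    voting="soft",
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
# Each sample is predicted by a model that never saw it during training.
y_pred = cross_val_predict(rf_mlp, X, y, cv=cv)


def safe_ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred):
    """Compute the abstract's metrics with label 1 as the positive class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": safe_ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "ppv": safe_ratio(tp, tp + fp),
        "npv": safe_ratio(tn, tn + fn),
        "f1": safe_ratio(2 * tp, 2 * tp + fp + fn),
    }


for metric, value in binary_metrics(y, y_pred).items():
    print(f"{metric}: {value:.2f}")
```

For nonoperative management, sensitivity and NPV for the complete-response class are the figures to watch, since a missed non-responder carries more risk than an unnecessary operation.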