Abstract

Radiation-induced esophagitis (RE) is a side effect of radiation therapy that can be dose limiting and is associated with worse overall survival. Currently, few factors beyond esophageal radiation dose are known to be associated with RE. We hypothesized that machine learning (ML) tools can identify factors associated with the development of RE. In this study, we used a dataset from a cohort of 203 consecutive patients with stage II-III locally advanced non-small cell lung cancer (NSCLC), of whom 11.3% developed grade ≥3 RE. Patients were treated between 2008 and 2016 with IMRT or proton therapy to a median dose of 66.6 Gy in 1.8 Gy fractions (range 60-80 Gy in 1.8-2.5 Gy fractions). We evaluated 32 continuous and categorical features per patient, grouped into risk factors, comorbidities, pretreatment imaging, stage, histology, radiation treatment, chemotherapy, and dosimetry. Univariate analysis was performed using optimally trained decision stumps to determine statistically significant features and their corresponding RE thresholds. Multivariate analysis was carried out using sequential forward floating selection (SFFS) with decision trees for feature selection. To assess the combined predictive capacity of the features, we used RUSBoost, a sampling/boosting approach that allows strong classifiers to be trained from skewed datasets. Balanced cross-validation was used to identify optimal series and thresholds of discriminative variables that map the feature values to the possible presence of RE. The area under the receiver operating characteristic curve (AUC) was used to measure the combined classification performance of the selected features. Univariate analysis showed that esophagus maximum dose >65.1 Gy (p=0.001), lung V20 >28% (p=0.02), and heart V5 >34% (p=0.04) could consistently predict the presence of RE and were significantly correlated (p<0.05). Moreover, the heart mean, lung mean, and total delivered doses, as well as the T stage, agents/drugs, heart V60, and pre-FEV1, were statistically significant predictors of RE (all p<0.05), while age at diagnosis and esophagus mean dose were marginally significant (both p=0.05). On multivariate analysis, feature selection using SFFS chose the esophagus maximum dose in 45 and the heart mean dose in 53 of 100 experimental trials. The RUSBoost ensemble with 250 trees achieved a combined classification AUC of 0.62 (95% confidence interval 0.52-0.72). This is the largest report to date of ML applied to predict RE. We found that ML allows for the identification of features and thresholds predictive of RE. These findings can inform treatment planning for radiation oncologists and may ultimately allow for personalized treatment delivery and a reduction in patient morbidity. Prospective data validating these findings are needed.
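
The univariate threshold search and the AUC measure described in the abstract can be illustrated with a minimal Python sketch. This is not the study's code: the function names (`fit_stump`, `auc`), the use of balanced accuracy as the stump's training criterion, and the toy data are all assumptions made for illustration, and the SFFS and RUSBoost steps are not reproduced.

```python
from typing import Sequence, Tuple


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity; suited to skewed class ratios."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def fit_stump(values: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Find the threshold t maximizing balanced accuracy for the
    rule 'value > t predicts the event' (assumed criterion)."""
    xs = sorted(set(values))
    if len(xs) < 2:
        raise ValueError("need at least two distinct feature values")
    best_t, best_score = xs[0], -1.0
    # Candidate thresholds are midpoints between consecutive distinct values.
    for a, b in zip(xs, xs[1:]):
        t = (a + b) / 2
        preds = [1 if v > t else 0 for v in values]
        score = balanced_accuracy(labels, preds)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy data (not patient data): one feature, binary outcome.
toy_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]

threshold, score = fit_stump(toy_values, toy_labels)
toy_auc = auc(toy_values, toy_labels)
print(f"threshold={threshold}, balanced accuracy={score:.3f}, AUC={toy_auc:.3f}")
```

Balanced accuracy is used here only because it weights the minority class equally with the majority class, which matters when about one patient in nine has the event; any other split criterion could be substituted.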

