Abstract

<h3>Purpose/Objective(s)</h3> Radiation pneumonitis (RP) is a significant cause of toxicity in patients receiving radiation therapy (RT) for lung cancer. Machine learning techniques have been used to develop models that predict RP, but prior efforts have largely focused on conventionally fractionated RT. This study aims to add to the growing body of literature that uses machine learning to identify and verify important features related to RP after RT, and to generate an algorithm capable of predicting this adverse outcome. It differs from prior efforts by focusing on stereotactic body radiation therapy (SBRT) rather than conventionally fractionated RT and by using a combination of accessible dosimetric and clinical features, with the goal of creating a clinically useful model to predict RP risk.

<h3>Materials/Methods</h3> The study used data from 201 lung cancer patients who were treated with SBRT between 2005 and 2015. Patient data, including demographics, tumor characteristics, and dosimetric features, were analyzed for association with symptomatic RP, defined as CTCAE v4.0 Grade ≥ 2. Before modeling, the chi-square test was used to rank features by importance. Class imbalance was corrected using the Synthetic Minority Oversampling Technique (SMOTE), and commercial software was used to generate classification machine learning models from the balanced dataset. Models were tested on the original dataset for classification accuracy, area under the curve (AUC), sensitivity, and specificity. Several algorithms were evaluated, including Decision Trees, Discriminant Analysis, Logistic Regression, Naive Bayes Classifiers, Support Vector Machines, Nearest Neighbor Classifiers, and Ensemble Classifiers. Each model used 10-fold cross-validation to limit overfitting.

<h3>Results</h3> Of the 201 patients receiving SBRT, 24 (11.9%) developed symptomatic RP. Based on the chi-square test results, increased lung V12.5 (volume receiving 12.5 Gy), multiple tumor sites, prior thoracic RT or lung surgery, right-sided tumor laterality, and former smoking status were selected as predictors in the machine learning models. After testing, Ensemble classifiers were found to be the most accurate overall. Of the Ensemble models, a Bagged Trees classifier performed best, correctly classifying 23 of the 24 cases of symptomatic RP. The model had an AUC of 0.93, an overall accuracy of 91.5%, a sensitivity of 95.8%, and a specificity of 91.0%.

<h3>Conclusion</h3> A machine learning algorithm was developed to predict RP using a combination of 5 ranked dosimetric and clinical features. The model has an overall accuracy of 91.5% and an AUC of 0.93. Its advantage over other models is that it was generated from SBRT patient data, as opposed to standard fractionation RT, and from easily accessible clinical and tumor characteristics. Future research will test the model's accuracy on an independent dataset and carry out prospective studies to verify the model's clinical utility.
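
As a quick consistency check on the reported figures, the sketch below recomputes incidence, sensitivity, specificity, and accuracy from confusion-matrix counts. Only the totals (201 patients, 24 RP cases, 23 correctly classified) come from the abstract. The true-negative count is not reported, so it is inferred here from the stated 91.0% specificity and should be read as an assumption, not as the authors' data.

```python
# Counts reported in the abstract.
total_patients = 201
rp_cases = 24          # symptomatic RP (positives)
true_positives = 23    # RP cases the Bagged Trees model classified correctly

# Counts that follow directly from the reported ones.
non_rp_cases = total_patients - rp_cases       # 177 patients without RP
false_negatives = rp_cases - true_positives    # 1 missed RP case

# ASSUMPTION: true negatives are not reported in the abstract. This is the
# integer count closest to the reported specificity of 91.0%.
true_negatives = round(0.910 * non_rp_cases)
false_positives = non_rp_cases - true_negatives


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


incidence = percent(rp_cases, total_patients)
sensitivity = percent(true_positives, rp_cases)
specificity = percent(true_negatives, non_rp_cases)
accuracy = percent(true_positives + true_negatives, total_patients)

print(incidence, sensitivity, specificity, accuracy)
# prints: 11.9 95.8 91.0 91.5
```

Under this inferred count (161 true negatives, 16 false positives), the four percentages match those stated in the Results, so the reported metrics are mutually consistent.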
