Abstract

Surgery and radiotherapy (RT) have equivalent tumor control outcomes for most categories of localized prostate cancer. Consequently, one of the primary factors driving patient decision-making for curative treatment in prostate cancer is expected treatment toxicity in domains such as genitourinary (GU) and gastrointestinal (GI). However, there is a lack of robust prognostic models for toxicity prediction. Both clinical/dosimetric features and genomics likely contribute to toxicity risk. Our work ultimately aims to predict prostate RT toxicity via machine learning applied to both clinical/dosimetric and genomic domains. Here, we describe using clinical/dosimetric variables to train random forest (RF) and logistic regression (LR) models to predict prostate RT toxicity. We use data from 1998 patients treated on the “Conventional versus Hypofractionated High-dose intensity-modulated radiotherapy for Prostate cancer” (CHHiP) trial. GU and GI toxicities were collected using three scales: RTOG, Royal Marsden Hospital, and Late Effects of Normal Tissue-Subjective Objective Management (LENT-SOM). Toxicities were harmonized to “cystitis, noninfective” for GU or “proctitis” for GI via Common Terminology Criteria for Adverse Events (CTCAE) v5.0. The outcome to be predicted was the probability of Grade 2+ toxicity at 2 years post-RT. The pruned input predictor variables included organ-at-risk dosimetric measures, prescription dose, demographics, and co-morbidities. Analyses were performed in Python 3.7.1 and R 3.5.1. LR is a generalized linear model with a logit link function. RF is an ensemble decision tree method that uses bootstrapping and splits on subsets of variables selected at random. RF was tuned with grid search using 5-fold cross-validation, and LR was tuned using backward stepwise selection. Missing data were imputed using the median. Performance is presented as area under the receiver operating characteristic curve (AUC) on 20% of the data held out for testing.
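The modeling steps described above (median imputation, a 20% hold-out, RF tuned by grid search with 5-fold cross-validation, and AUC on the held-out data) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' code: the feature matrix, the hyperparameter grid, and all variable names are assumptions, and backward stepwise selection for LR is omitted because scikit-learn does not provide it.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for clinical/dosimetric predictors (assumption: not trial data).
n_patients, n_features = 400, 6
X = rng.normal(size=(n_patients, n_features))
logit = 0.8 * X[:, 0] - 0.5 * X[:, 1]
y = (rng.random(n_patients) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
X[rng.random(X.shape) < 0.05] = np.nan  # knock out ~5% of values as "missing"

# Hold out 20% of patients for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Random forest: median imputation, then grid search with 5-fold CV.
rf_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("model", RandomForestClassifier(random_state=0)),
])
param_grid = {  # hypothetical grid; the abstract does not state the one used
    "model__n_estimators": [50, 100],
    "model__max_features": ["sqrt", None],
    "model__min_samples_leaf": [1, 5],
}
rf_search = GridSearchCV(rf_pipeline, param_grid, scoring="roc_auc", cv=5)
rf_search.fit(X_train, y_train)

# Logistic regression on all predictors. Note that scikit-learn applies L2
# regularization by default, so this is not an unpenalized GLM fit.
lr_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("model", LogisticRegression(max_iter=1000)),
])
lr_pipeline.fit(X_train, y_train)

# AUC on the held-out 20%.
auc_rf = roc_auc_score(y_test, rf_search.predict_proba(X_test)[:, 1])
auc_lr = roc_auc_score(y_test, lr_pipeline.predict_proba(X_test)[:, 1])
print(f"RF AUC: {auc_rf:.3f}, LR AUC: {auc_lr:.3f}")
```

Placing the imputer inside each pipeline means the medians are learned from the training folds only, which avoids leaking information from the validation folds or the test set.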
The RF model predicted proctitis with AUC 0.570 (top 5 variable importance: age, %rectum receiving 50, 60, 65 and 70 Gy) and cystitis with AUC 0.525 (top 5: age, %rectum receiving 50 and 60 Gy, %bladder receiving 50 and 60 Gy). The LR model predicted proctitis with AUC 0.650 (top 5 statistically significant variables: inflammatory bowel disease, %rectum receiving 30 and 40 Gy, outlines modified, image guidance: gold seeds) and cystitis with AUC 0.516 (top 5: %bladder receiving 74 Gy, diabetes, %urethral bulb receiving 50 Gy, hypertension, European ethnicity). Both LR and RF approaches using clinical/dosimetric variables alone achieved only modest performance despite a large dataset, curation of input predictors, and model tuning. One plausible explanation is an inadequate signal-to-noise ratio when only clinical/dosimetric variables are used. Our future work will include genomic variables with the goal of improving the signal-to-noise ratio and predictive performance.
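To put the reported AUC values in context: AUC can be read as the probability that a randomly chosen patient with toxicity receives a higher predicted risk than a randomly chosen patient without it, so 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The sketch below computes AUC directly from that pairwise definition; the function name `rank_auc` and the toy inputs are illustrative assumptions, not part of the study.

```python
def rank_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
print(rank_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A model that assigns every patient the same risk scores exactly 0.5 under this definition, which is the baseline against which values such as 0.516 or 0.525 should be judged.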
