Abstract

There is emerging evidence of intrinsic biological differences in cancer responsiveness to therapeutic radiation between African-American (AA) and non-Hispanic White (NHW) men (Spratt et al. [abstract ASTRO 2018]). This racial difference in radiosensitivity likely stems from variations in the expression of genes involved in radiation response pathways. For example, decreased expression of double-strand break repair genes is associated with increased radiosensitivity in somatic and cancer cell lines. We hypothesized that a racial difference in the expression levels of genes participating in radiation response pathways could be identified via a machine learning approach. We extracted gene expression data for 7,470 patients from the Genomic Data Commons Pan-Cancer database whose race was identified as AA (n=802) or NHW (n=6,668). For each patient, the expression levels of 741 genes known to be involved in radiation response pathways were selected for subsequent analysis. An ensemble of five machine learning methods (support vector machine (SVM), linear discriminant analysis (LDA), gradient boosted machine (GBM), Bayesian generalized linear model (BGLM), and sample mean (SM)) was trained on 80% of the data to predict race from this 741-gene expression panel. Out-of-sample error was estimated using 5-fold cross-validation. The trained ensemble model was then used to predict race on the remaining 20% of the data. Performance of the ensemble model was evaluated via the area under the receiver operating characteristic curve (AUC). The mean squared errors for the SVM, LDA, GBM, BGLM, and SM methods were 0.071, 0.075, 0.081, 0.076, and 0.096, respectively. The ensemble model achieved a mean squared error of 0.068. Prediction by the ensemble model yielded an AUC of 0.861 (95% CI 0.844-0.878).
Expression levels of radiation response pathway genes can be used to accurately identify race via an ensemble of machine learning models. This supports the emerging evidence that race may be associated with radiosensitivity via intrinsic biologic differences in gene expression levels. Further studies are warranted to investigate whether these gene expression differences translate to clinically detectable variation in radiosensitivity and tumor control among different patient populations.
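
As a rough illustration of the two metrics reported above, the following minimal Python sketch computes a mean squared error and an AUC for an ensemble of predicted probabilities. It is not the authors' code. The abstract does not state how the five models are combined, so the unweighted average used here is an assumption, and the function names and toy data are hypothetical.

```python
from typing import List, Sequence


def mean_squared_error(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between 0/1 labels and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(labels, probs)) / len(labels)


def average_ensemble(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Unweighted mean of per-model probabilities.

    Assumption: the abstract does not specify the combination rule.
    """
    n_models = len(model_probs)
    return [sum(column) / n_models for column in zip(*model_probs)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only; not taken from the study.
labels = [1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.6, 0.6, 0.1, 0.3]
model_b = [0.7, 0.4, 0.8, 0.4, 0.3, 0.5]

ensemble = average_ensemble([model_a, model_b])
print("ensemble MSE:", round(mean_squared_error(labels, ensemble), 3))
print("ensemble AUC:", round(roc_auc(labels, ensemble), 3))
```

The pairwise AUC above is quadratic in the number of samples, which is fine for a toy example; for a cohort of several thousand patients, a rank-based implementation from an established library would be the more practical choice.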
