Abstract
RNA sequencing (RNAseq) is a recent technology that profiles gene expression by measuring the relative frequency of the RNAseq reads. RNAseq read count data are increasingly used in oncologic care, while radiology features (radiomics) have also been gaining utility in radiology practice, including disease diagnosis, monitoring, and treatment planning. However, contemporary literature lacks appropriate RNA-radiomics (henceforth, radiogenomics) joint modeling in which the RNAseq distribution is adaptive and preserves the count nature of RNAseq read count data for glioma grading and prediction. The Negative Binomial (NB) distribution may be useful for modeling RNAseq read count data in a way that addresses these potential shortcomings. In this study, we propose a novel radiogenomics-NB model for glioma grading and prediction. Our radiogenomics-NB model is developed based on differentially expressed RNAseq and selected radiomics/volumetric features, which characterize tumor volume and sub-regions. The NB distribution is fitted to the RNAseq count data, and a log-linear regression model is assumed to link the estimated NB mean to the radiomics features. Three radiogenomics-NB molecular mutation models (IDH mutation, 1p/19q codeletion, and ATRX mutation) are investigated. Additionally, we explore gender-specific effects on the radiogenomics-NB models. Finally, we compare the performance of the three proposed mutation prediction radiogenomics-NB models with several well-known methods in the literature: Negative Binomial Linear Discriminant Analysis (NBLDA), differentially expressed RNAseq with Random Forest (RF-genomics), radiomics and differentially expressed RNAseq with Random Forest (RF-radiogenomics), and Voom-based count transformation combined with the nearest shrinkage classifier (VoomNSC).
Our analysis shows that the proposed radiogenomics-NB model significantly outperforms the competing models in the literature (ANOVA test, p < 0.05) for prediction of IDH and ATRX mutations, and offers similar performance for prediction of 1p/19q codeletion.
Highlights
Radiomics is increasingly being applied to radiology practice in disease diagnosis, grading, monitoring, and treatment planning [1, 2]
The dataset in this study consists of 108 pre-operative lower-grade glioma (LGG) patients described in Menze et al. [28], Bakas et al. [29], and Bakas et al. [30]
The dataset provides the segmented subregions of the LGG: gadolinium-enhancing tumor (ET), peritumoral edema (ED), and necrosis along with non-enhancing tumor (NCR/NET)
Summary
Radiomics is increasingly being applied to radiology practice in disease diagnosis, grading, monitoring, and treatment planning [1, 2]. Fusing the important radiomics and genomics information in a suitable computational machine learning (ML) model may help to achieve a more comprehensive disease diagnosis, prognosis, and treatment planning scheme [3,4,5]. To compensate for the lack of appropriate ML models, researchers have proposed transforming the RNAseq read count data to approximate a normal distribution. This transformation allows the use of existing methods such as the nearest shrinkage method [12, 13] or Random Forest for classification. However, such a transformation removes the count-based nature of the RNAseq read count data and fails to fully preserve the strong mean-variance relationship that is otherwise useful for glioma classification and prediction [14, 15]. NB is similar to a Poisson distribution but has an additional parameter, called "dispersion", that allows the NB distribution to adjust its variance without affecting the mean.
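The role of the dispersion parameter can be illustrated with a short simulation. The sketch below is not the authors' implementation. It assumes the common mean/dispersion parameterization in which the variance equals mu + phi * mu^2, and it uses hypothetical coefficient and feature values to show how a log-linear link could map radiomics features to an NB mean. The function names (nb_variance, sample_nb, nb_mean_from_features) are made up for this illustration.

```python
import numpy as np


def nb_variance(mu, dispersion):
    """NB variance under the assumed parameterization: mu + phi * mu^2."""
    return mu + dispersion * mu ** 2


def sample_nb(mu, dispersion, size, rng):
    """Draw NB counts with mean mu and dispersion phi.

    NumPy parameterizes the NB by (n, p) with mean n * (1 - p) / p.
    Choosing n = 1 / phi and p = n / (n + mu) yields mean mu and
    variance mu + phi * mu^2.
    """
    n = 1.0 / dispersion
    p = n / (n + mu)
    return rng.negative_binomial(n, p, size=size)


def nb_mean_from_features(x, beta0, beta):
    """Log-linear link (assumed form): log(mu) = beta0 + x . beta."""
    return np.exp(beta0 + x @ beta)


rng = np.random.default_rng(0)

# Hypothetical radiomics feature vector and coefficients, for illustration only.
x = np.array([0.5, -1.0])
beta0, beta = np.log(50.0), np.array([0.2, 0.1])
mu = nb_mean_from_features(x, beta0, beta)

# Same mean, increasing dispersion: the variance grows while the mean stays put.
for phi in (0.01, 0.1, 0.5):
    counts = sample_nb(mu, phi, 200_000, rng)
    print(f"phi={phi}: sample mean={counts.mean():.2f}, "
          f"sample var={counts.var():.1f}, "
          f"theoretical var={nb_variance(mu, phi):.1f}")
```

As phi approaches zero the variance approaches the mean, which is the Poisson case. Larger phi values give the overdispersion typically seen in RNAseq counts, and this is the mean-variance relationship that a normalizing transformation does not fully preserve.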