Abstract
Cancer is a leading cause of death and is often attributed to genetic errors or genomic instability, so many studies have explored its genetic basis using microarray and RNA-seq methods, linking gene expression data to patient survival. This research introduces a methodological framework that combines heterogeneous gene expression data, random forest selection, and pathway analysis with clinical information and Cox regression analysis to discover prognostic biomarkers. Heterogeneous gene expression data for colorectal cancer were collected from TCGA-COAD (RNA-seq) and from GSE17536 and GSE39582 (microarray), and were integrated using Entrez Gene IDs. Using Cox regression analysis and random forest, genes with consistent hazard ratios and a significant effect on patient survival were selected. Predictive accuracy was evaluated using ROC curves. Pathway analysis was used to identify potential RNA biomarkers. In total, 28 RNA biomarkers were identified. Pathway analysis revealed enrichment in cancer-related pathways, notably EGFR downstream signaling and IGF1R signaling. Three RNA biomarkers (ZEB1-AS1, PI4K2A, and ITGB8-AS1) and two clinical biomarkers (stage and age) were chosen for a prognostic model, which improved predictive performance compared with using clinical biomarkers alone. Despite the challenges of biomarker identification, this study underscores the value of integrating heterogeneous gene expression data for biomarker discovery.
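The cross-cohort selection step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "consistent hazard ratios" means the same direction of effect (all above 1 or all below 1) in every cohort, and that significance means a p-value below 0.05 in each cohort; the abstract states neither definition. The function name `consistent_genes`, the `alpha` parameter, and the gene IDs and statistics in `example` are hypothetical placeholders, not values from the study.

```python
from typing import Dict, List, Tuple

# Hypothetical input shape: cohort name -> {Entrez Gene ID -> (hazard ratio, p-value)}.
# In practice these statistics would come from a per-gene Cox regression in each cohort.
CohortStats = Dict[str, Dict[str, Tuple[float, float]]]


def consistent_genes(cohorts: CohortStats, alpha: float = 0.05) -> List[str]:
    """Return genes present in every cohort whose hazard ratios agree in
    direction and whose p-values are all below alpha (assumed criteria)."""
    # Integrate cohorts on shared Entrez Gene IDs.
    shared = set.intersection(*(set(stats) for stats in cohorts.values()))
    selected = []
    for gene in sorted(shared):
        hrs = [stats[gene][0] for stats in cohorts.values()]
        pvals = [stats[gene][1] for stats in cohorts.values()]
        same_direction = all(hr > 1 for hr in hrs) or all(hr < 1 for hr in hrs)
        significant = all(p < alpha for p in pvals)
        if same_direction and significant:
            selected.append(gene)
    return selected


# Made-up numbers for illustration only; the IDs are placeholders.
example: CohortStats = {
    "TCGA-COAD": {"1001": (1.8, 0.01), "1002": (1.5, 0.03), "1003": (0.6, 0.02)},
    "GSE17536": {"1001": (1.6, 0.02), "1002": (0.7, 0.04), "1003": (0.7, 0.20)},
    "GSE39582": {"1001": (2.1, 0.001), "1002": (1.4, 0.01), "1004": (1.2, 0.03)},
}

print(consistent_genes(example))
```

In this toy input, only gene `1001` is measured in all three cohorts, has the same direction of effect in each, and passes the assumed threshold everywhere. Gene `1002` is dropped because its hazard ratio changes direction in one cohort, and `1003` and `1004` are dropped because they are not shared by all cohorts.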