Abstract
Colorectal cancer (CRC) is a complex disease with diverse genetic alterations and causes 10% of cancer-related deaths worldwide. Understanding its molecular mechanisms is essential for identifying potential biomarkers and therapeutic targets for its effective management. We integrated copy number alteration (CNA) and mutation data through their associated differentially expressed genes, termed candidate genes (CGs), which were identified using bioinformatics approaches. Using the CGs, we then performed weighted correlation network analysis (WGCNA) and applied several hazard models, including univariate Cox, least absolute shrinkage and selection operator (LASSO) Cox and multivariate Cox regression, to identify the key genes involved in CRC progression. We used different machine-learning models to demonstrate the power of the selected hub genes to discriminate between normal and CRC (early- and late-stage) samples. The integration of CNA with mRNA expression identified over 3000 CGs, including CRC-specific driver genes such as MYC and APC. In addition, pathway analysis revealed that the CGs are mainly enriched in the endocytosis, cell cycle, Wnt signalling and mTOR signalling pathways. The hazard models identified four key genes, CASP2, HCN4, LRRC69 and SRD5A1, that were significantly associated with CRC progression and predicted 1-year, 3-year and 5-year survival. WGCNA identified seven hub genes, DSCC1, ETV4, KIAA1549, NOP56, RRS1, TEAD4 and ANKRD13B, which exhibited strong predictive performance in distinguishing normal from CRC (early- and late-stage) samples. Integrating regulatory information with gene expression improved early- versus late-stage prediction. The potential prognostic and diagnostic biomarkers identified in this study may guide the development of effective therapeutic strategies for CRC management.