Abstract

Reliable methods for identifying robust biomarkers of complex diseases are critical for diagnosis and prognosis. Integrating multi-omics data with protein-protein interaction (PPI) networks may help clarify disease characteristics at the molecular level. In this study, we developed and tested a novel network-based method to detect subnetwork markers for patients with colorectal cancer (CRC). We performed an integrated omics analysis using whole-genome gene expression profiling and copy number alteration (CNA) datasets, then built a gene interaction network for the significantly altered genes. We clustered the constructed gene network into subnetworks and assigned a score to each significant subnetwork. We developed a support vector machine (SVM) classifier using these scores as feature values and tested the methodology in independent CRC transcriptomic datasets. The network analysis yielded 15 subnetwork markers and revealed several hub genes that may play a significant role in colorectal cancer, including PTP4A3, FGFR2, PTX3, AURKA, FEN1, INHBA, and YES1. The 15-subnetwork classifier displayed over 98 percent accuracy in detecting patients with CRC. Compared with individual gene biomarkers, subnetwork markers based on integrated multi-omics and network analyses may lead to better disease classification, diagnosis, and prognosis.
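To make the scoring step concrete, the sketch below shows one way a per-sample subnetwork score could be computed before being passed to a classifier as a feature value. The abstract does not state the scoring function, so the choice here (the mean of z-scored expression over the member genes) is an assumption, as are the function names, the gene labels `G1` to `G3`, the subnetwork labels, and the toy values. It is not the authors' implementation.

```python
from statistics import mean, pstdev


def zscore_by_gene(expression):
    """Standardize each gene's expression values across samples."""
    z = {}
    for gene, values in expression.items():
        mu = mean(values)
        sigma = pstdev(values)
        # A gene with constant expression carries no signal; map it to zeros.
        z[gene] = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
    return z


def subnetwork_scores(expression, subnetworks):
    """Return one score per sample for each subnetwork.

    ASSUMPTION: the score is the mean z-scored expression of the member
    genes. The source abstract does not specify its scoring function.
    """
    z = zscore_by_gene(expression)
    n_samples = len(next(iter(expression.values())))
    scores = {}
    for name, genes in subnetworks.items():
        members = [g for g in genes if g in z]
        if not members:
            continue  # skip subnetworks with no measured genes
        scores[name] = [
            mean(z[g][i] for g in members) for i in range(n_samples)
        ]
    return scores


# Hypothetical toy data: three placeholder genes measured in four samples.
expression = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],
    "G3": [5.0, 5.0, 5.0, 5.0],
}
# Hypothetical subnetwork membership, e.g. as produced by network clustering.
subnetworks = {"S1": ["G1", "G2"], "S2": ["G3"], "S3": ["NOT_MEASURED"]}

scores = subnetwork_scores(expression, subnetworks)
```

Each list in `scores` holds one value per sample, so stacking the lists column-wise gives a samples-by-subnetworks feature matrix of the kind an SVM would take as input.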

Highlights

  • We obtained 91 significant genes located in CNA regions from (Eldai et al, 2013) and used a Venn diagram approach to identify significantly altered messenger RNAs (mRNAs) that have concomitant copy number alterations (Figure 1B)

  • We proposed an integrated omics and network-based methodology to identify subnetwork markers

  • We applied our method to colorectal cancer data from Saudi patients, identified 15 subnetwork markers associated with the disease, and validated their diagnostic and prognostic potential using independent datasets
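The Venn diagram step in the first highlight amounts to a set intersection between two gene lists. The sketch below illustrates this with placeholder symbols; the function name, the normalization of symbols to upper case, and the toy lists are assumptions for illustration and do not reproduce the study's gene sets.

```python
def overlap_genes(deg_genes, cna_genes):
    """Partition two gene lists into the three regions of a two-set Venn diagram."""
    # ASSUMPTION: symbols are compared case-insensitively after trimming spaces.
    degs = {g.strip().upper() for g in deg_genes}
    cnas = {g.strip().upper() for g in cna_genes}
    return {
        "both": sorted(degs & cnas),             # altered expression and CNA
        "expression_only": sorted(degs - cnas),  # altered expression only
        "cna_only": sorted(cnas - degs),         # CNA only
    }


# Hypothetical placeholder symbols, not genes reported in the study.
deg_list = ["GENE_A", "GENE_B", "gene_c "]
cna_list = ["GENE_C", "GENE_D", "GENE_B"]

regions = overlap_genes(deg_list, cna_list)
```

The `both` region corresponds to genes with concomitant expression change and copy number alteration, which are the ones carried forward to network construction.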



Introduction

Artificial intelligence (AI) and machine learning (ML) approaches have been widely used to support disease diagnosis and predict outcomes (Maciukiewicz et al, 2018; Lai et al, 2019; Eicher et al, 2020; Jamal et al, 2020; Sanchez and Mackenzie, 2020; Sinkala et al, 2020; Stafford et al, 2020; Toraih et al, 2020). Sanchez and Mackenzie identified methylation biomarkers for leukemia by applying a machine learning approach to the PPI networks of differentially methylated genes (DMGs) and differentially expressed genes (DEGs) (Sanchez and Mackenzie, 2020). The authors reported that the identified biomarkers are reliable and associated with cancer development and risk (Sanchez and Mackenzie, 2020). Tabl et al proposed a hierarchical machine learning system that uses gene expression and clinical data to develop biomarkers supporting the choice of the best therapy for breast cancer patients, and reported high classification accuracy (Tabl et al, 2019). Sinkala et al applied machine learning algorithms coupled with integrative profiling of multiple data types to identify biomarkers that can differentiate between pancreatic cancer subtypes (Sinkala et al, 2020).
