Abstract

Numerous studies have been conducted to elucidate the relation of tumor proximity to cancer prognosis and treatment efficacy in colorectal cancer. However, the molecular pathways and prognoses of left- and right-sided colorectal cancers differ, and this difference has not been fully investigated at the genomic level. In this study, a set of data science approaches, including six feature selection methods and three classification models, was used to predict tumor location from gene expression profiles. Classification performance was evaluated using specificity, sensitivity, accuracy, and the Matthews correlation coefficient (MCC). Gene ontology enrichment analysis was performed with the Gene Ontology PANTHER Classification System. For the 50 most significant genes, protein-protein interactions and drug-gene interactions were analyzed using the GeneMANIA, Cytoscape, CytoHubba, MCODE, and DGIdb tools and databases. The highest classification accuracy (90%) was achieved with the 200 most significant genes when the ensemble decision tree classification model was used with the ReliefF feature selection method. Molecular pathways and drug interactions were investigated for the 50 most significant genes. It is concluded that a machine-learning-based approach could be useful for discovering significant genes that may play an important role in the development of new therapies and drugs for colorectal cancer.
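The four evaluation metrics named above can all be derived from the confusion matrix of a binary classifier. The following is a minimal sketch of those standard definitions, not the study's own code. The function name `binary_metrics`, the choice of which tumor side counts as the positive class, and the example counts are all hypothetical and are not taken from the study's data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, accuracy and MCC from confusion-matrix counts.

    tp/fn count positive-class samples (assumed here, for illustration only,
    to be one tumor side); tn/fp count negative-class samples.
    """
    total = tp + fp + tn + fn
    # Sensitivity: fraction of positive samples correctly identified.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of negative samples correctly identified.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy: fraction of all samples correctly classified.
    accuracy = (tp + tn) / total if total else 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    # By convention it is set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two tumor-location classes are unevenly represented.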
