Abstract

Background: Colon cancer is common worldwide and is a leading cause of cancer-related death. Advances in sequencing technologies have made multiple levels of omics data available. In this study, we propose an integrative prognostic model for colon cancer that combines clinical and multi-omics data.

Methods: In total, 344 patients were included in this study. Clinical, gene expression, DNA methylation and miRNA expression data were retrieved from The Cancer Genome Atlas (TCGA). To accommodate the high dimensionality of the omics data, unsupervised clustering was used as a dimension reduction method. The bias-corrected Harrell’s concordance index was used to identify which clustering result provided the best prognostic performance. Finally, we built a prognostic prediction model that integrates clinical and multi-omics data. Uno’s concordance index with cross-validation was used to compare the discriminative performance of prognostic models constructed with different covariates.

Results: Combining clinical and multi-omics data improved prognostic performance: the bias-corrected Harrell’s concordance index of the prognostic model increased from 0.7424 (clinical features only) to 0.7604 (clinical features and three types of omics features). Additionally, the 2-year, 3-year and 5-year Uno’s concordance statistics increased from 0.7329, 0.7043 and 0.7002 (clinical features only) to 0.7639, 0.7474 and 0.7597 (clinical features and three types of omics features), respectively.

Conclusion: This study combined clinical and multi-omics data to improve the prediction of colon cancer prognosis.
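To make the concordance figures above easier to interpret, the following is a minimal sketch of the plain (uncorrected) Harrell’s concordance index. It is not the authors' code: the function name `harrell_c_index` and the toy data are assumptions for illustration, pairs with tied survival times are skipped for simplicity, and the bootstrap bias correction and Uno’s inverse-probability-of-censoring weighting used in the study are not implemented here.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Plain Harrell's C: share of comparable pairs ranked correctly.

    times       -- observed follow-up times
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ended in an event.
        # Tied times are skipped here (a simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0      # earlier death had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5      # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third patient is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported increase from 0.7424 to 0.7604 is a modest gain in how well the model orders patients by risk.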

Highlights

  • Colon cancer is common worldwide and is a leading cause of cancer-related death.

  • Data preparation: Normalized and preprocessed clinical and omics data for primary tumors in The Cancer Genome Atlas (TCGA) Colon Adenocarcinoma (COAD) project were downloaded from the new TCGA data portal with the provided data transfer tool.

  • Results of omics data processing: Overall, eight combinations of distance method, linkage method and cluster number were identified for clustering the different types of omics data in combination with clinical features: two combinations for gene expression, three for DNA methylation and three for miRNA expression.
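As an illustration of what a combination of distance method, linkage method and cluster number means in practice, here is a minimal pure-Python sketch of agglomerative hierarchical clustering that reduces each patient's omics profile to a single cluster label. The function names, the particular distance and linkage options, and the toy profiles are assumptions for illustration; the excerpt does not state which options the authors evaluated or which software they used.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


# Each linkage turns the list of pairwise distances between two clusters
# into one between-cluster distance.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda dists: sum(dists) / len(dists),
}


def agglomerative_labels(samples, n_clusters, distance=euclidean, linkage="average"):
    """Merge the two closest clusters until n_clusters remain; return labels."""
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(samples))]  # start with singletons
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = combine([distance(samples[i], samples[j])
                             for i in clusters[x] for j in clusters[y]])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    labels = [0] * len(samples)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy "omics profiles" (hypothetical): two obvious groups of patients.
profiles = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]]
cluster_labels = agglomerative_labels(profiles, n_clusters=2,
                                      distance=manhattan, linkage="complete")
```

Each resulting label can then enter a survival model as one categorical covariate in place of thousands of raw omics features. This naive version recomputes all pairwise distances at every merge, so it is only suitable for small examples.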


Summary

Introduction

Colon cancer, which is a subset of colorectal cancer (CRC), is common worldwide and is a leading cause of cancer-related death. Multiple levels of omics data are available due to the development of sequencing technologies. Omics data have been widely used for cancer classification based on identified gene signatures, gene pathways, and protein-protein interaction networks, among others [3,4,5]. Such classifications can help oncologists provide more accurate treatment regimens for individuals.
