Abstract
Cancer subtypes can improve our understanding of cancer and suggest more precise treatment for patients. Multi-omics molecular data can characterize cancers at different levels. Many computational methods that integrate multi-omics data for cancer subtyping have been proposed. However, there are no consistent criteria for evaluating these integration methods because gold standards are lacking (e.g., the true number of subtypes in a specific cancer). Because comprehensive evaluation and comparison of different methods can guide users in selecting an appropriate method for their own purpose, we develop a scalable platform, CEPICS, for comprehensively evaluating and comparing multi-omics data integration methods in cancer subtyping. Given a user-specified maximum number of subtypes, k-max, CEPICS provides (1) cancer subtyping results from up to five built-in state-of-the-art integration methods for each number of subtypes from two to k-max, (2) a report that evaluates each user-selected method and compares the methods using clustering performance metrics and clinical survival analysis, and (3) an overall analysis of the subtyping results from the different methods, which represents a robust cancer subtype prediction for the samples. Furthermore, users can upload subtyping results of their own methods to compare with the built-in methods. CEPICS is implemented as an R package and is freely available at https://github.com/GaoLabXDU/CEPICS.
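To make the idea of comparing subtyping results across methods and across subtype numbers from two to k-max concrete, the following is a minimal Python sketch, not CEPICS code. It assumes the silhouette width as one possible clustering performance metric (the abstract does not name the metrics CEPICS uses), and the sample coordinates, the method names `method_A` and `method_B`, and their label vectors are all hypothetical toy values.

```python
from math import dist
from statistics import mean


def silhouette(points, labels):
    """Mean silhouette width of one clustering (closer to 1 is better)."""
    scores = []
    for i, p in enumerate(points):
        # Collect distances from sample i to all other samples, per cluster.
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            # Singleton cluster or a single cluster overall: score 0 by convention.
            scores.append(0.0)
            continue
        a = mean(own)                                   # cohesion
        b = min(mean(d) for d in by_cluster.values())   # separation
        scores.append((b - a) / (max(a, b) or 1.0))
    return mean(scores)


# Hypothetical toy data: six samples forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

k_max = 3  # user-specified maximum number of subtypes

# Hypothetical subtype labels from two methods for k = 2 .. k_max.
results = {
    ("method_A", 2): [0, 0, 0, 1, 1, 1],
    ("method_A", 3): [0, 0, 1, 2, 2, 2],
    ("method_B", 2): [0, 0, 1, 1, 1, 1],
    ("method_B", 3): [0, 1, 1, 2, 2, 2],
}

# Score every (method, k) pair and pick the best-scoring combination.
scores = {key: silhouette(points, labels) for key, labels in results.items()}
best = max(scores, key=scores.get)

for (method, k), score in sorted(scores.items()):
    print(f"{method} k={k}: silhouette={score:.3f}")
print("best:", best)
```

On this toy data the two-subtype result of `method_A` scores highest, which matches the visible grouping. A real comparison would also draw on clinical survival analysis, which this sketch leaves out.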
Highlights
With the development of high-throughput technologies, huge amounts of multi-omics data for cancers have been generated, such as genomics, epigenomics, and transcriptomics data.
CancerSubtypes (Xu et al, 2017) is an R package that produces cancer subtyping results with several existing methods. However, it provides only basic evaluation of each method rather than comparisons among them, and it requires users to specify the exact number of subtypes for each run, which is not known in advance.
We introduce the framework of CEPICS and present its three different application scenarios based on genomics, transcriptomics, and epigenetics data.
Summary
With the development of high-throughput technologies, huge amounts of multi-omics data for cancers have been generated, such as genomics, epigenomics, and transcriptomics data. Many computational methods that integrate multi-omics data for cancer subtyping have been proposed. These approaches can be divided into two main categories based on their data modeling strategies (Bersanelli et al, 2016): graph-based approaches (Hofree et al, 2013; Wang et al, 2016; Guo et al, 2018) and statistics-based approaches (Shen et al, 2009; Yuan et al, 2011; Kim et al, 2017). CancerSubtypes (Xu et al, 2017) is an R package that produces cancer subtyping results with several existing methods. However, it provides only basic evaluation of each method rather than comparisons among them, and it requires users to specify the exact number of subtypes for each run, which is not known in advance.