Abstract

A disease can be characterized not only by physiological symptoms but also by genomic, epigenetic, and transcriptomic features. The accumulation of large datasets makes it possible to investigate the relative effectiveness of each type of omics data, and of their combinations, for in silico analysis of diseases. Here, we used a classification method together with the well-established measure of information gain to analyze computationally the effect of aggregating omics data, specifically for the in silico classification of tumor versus normal samples in bladder urothelial carcinoma and kidney renal papillary cell carcinoma. We observed that combining multi-omics data such as copy number variation, DNA methylation, RNA-Seq, and somatic mutations has a beneficial effect. Quantitative analysis using information gain and several measures of classification performance showed that combining multiple omics data generally improved performance. Qualitative analysis with reference to previous studies also confirmed that genes with higher information gain are relevant to the target diseases. Our results indicate that combining multiple omics data is beneficial, and that information gain, which focuses on the distribution of attributes across target domains, could be a useful indicator of the effect of each type of omics data on tumor-normal sample classification.
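To make the information gain measure concrete, the following is a minimal sketch of how it could be computed for discretized omics attributes against tumor/normal labels, using the standard definition IG(Y; X) = H(Y) - H(Y | X). The abstract does not describe the actual pipeline, so the discretization into "high"/"low", the toy samples, and the feature names (`methylation_gene_a`, `expression_gene_b`) are all hypothetical and serve only as an illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus their weighted entropy within each attribute value."""
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: six samples, two discretized attributes
# taken from two different omics layers (assumed names and values).
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
features = {
    "methylation_gene_a": ["high", "high", "high", "low", "low", "low"],
    "expression_gene_b": ["high", "low", "high", "low", "high", "low"],
}

# Pooling attributes from several omics layers into one table lets them
# be ranked on a single scale, regardless of which layer they came from.
gains = {name: information_gain(vals, labels) for name, vals in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)

for name in ranking:
    print(f"{name}: {gains[name]:.4f}")
```

In this toy case the first attribute separates the two classes perfectly and so receives the maximum gain of one bit, while the second is only weakly associated with the label. Real omics measurements are continuous or count-valued, so some binning or thresholding step, not specified in the abstract, would be needed before applying this definition.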
