Abstract
Background: Integrating the rich information from multi-omics data has been a popular approach to survival prediction and bio-marker identification for several cancer studies. To facilitate the integrative analysis of multiple genomic profiles, several studies have suggested utilizing pathway information rather than using individual genomic profiles.
Methods: We have recently proposed an integrative directed random walk-based method utilizing pathway information (iDRW) for more robust and effective genomic feature extraction. In this study, we applied iDRW to multiple genomic profiles for two different cancers, and designed a directed gene-gene graph which reflects the interaction between gene expression and copy number data. In the experiments, the performances of the iDRW method and four state-of-the-art pathway-based methods were compared using a survival prediction model which classifies samples into two survival groups.
Results: The results show that the integrative analysis guided by pathway information not only improves prediction performance, but also provides better biological insights into the top pathways and genes prioritized by the model in both the neuroblastoma and the breast cancer datasets. The pathways and genes selected by the iDRW method were shown to be related to the corresponding cancers.
Conclusions: In this study, we demonstrated the effectiveness of a directed random walk-based multi-omics data integration method applied to gene expression and copy number data for both breast cancer and neuroblastoma datasets. We revamped a directed gene-gene graph considering the impact of copy number variation on gene expression and redefined the weight initialization and gene-scoring method. The benchmark result for iDRW with four pathway-based methods demonstrated that the iDRW method improved survival prediction performance and jointly identified cancer-related pathways and genes for two different cancer datasets.
Reviewers: This article was reviewed by Helena Molina-Abril and Marta Hidalgo.
Highlights
Integrating the rich information from multi-omics data has been a popular approach to survival prediction and bio-marker identification for several cancer studies
We investigated the impact of copy number variants (CNVs) on gene expression for two different cancer types, breast cancer and neuroblastoma, using the integrative directed random walk-based (iDRW) method
To integrate the gene expression and copy number alteration data, we first constructed a directed gene-gene graph representing the impact of copy number variants on gene expression by defining weight initializations and gene scoring measures for each genomic profile
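The idea of scoring genes by a random walk on a directed gene-gene graph can be illustrated with a minimal sketch. This is a generic random walk with restart, not the authors' iDRW implementation: the node names, the edge directions (copy number node pointing to expression node), the uniform initial weights, and the restart probability of 0.7 are all assumptions chosen only for illustration.

```python
def random_walk_with_restart(edges, initial_weights, restart=0.7,
                             tol=1e-10, max_iter=1000):
    """Generic random walk with restart on a directed graph.

    edges: list of (source, target) pairs.
    initial_weights: dict mapping every node to a non-negative start weight.
    Returns a dict of steady-state scores that sum to 1.
    """
    nodes = sorted(initial_weights)
    out_neighbors = {n: [] for n in nodes}
    for src, dst in edges:
        out_neighbors[src].append(dst)

    # Normalize the initial weights into a probability vector.
    total = sum(initial_weights.values())
    w0 = {n: initial_weights[n] / total for n in nodes}
    w = dict(w0)

    for _ in range(max_iter):
        # Restart step: jump back to the initial distribution.
        nxt = {n: restart * w0[n] for n in nodes}
        # Walk step: pass the remaining mass along outgoing edges.
        for src in nodes:
            share = (1.0 - restart) * w[src]
            targets = out_neighbors[src]
            if targets:
                for dst in targets:
                    nxt[dst] += share / len(targets)
            else:
                # Node without outgoing edges: return its mass to the
                # initial distribution so the scores still sum to 1.
                for n in nodes:
                    nxt[n] += share * w0[n]
        change = sum(abs(nxt[n] - w[n]) for n in nodes)
        w = nxt
        if change < tol:
            break
    return w


# Hypothetical toy graph: one copy number node influencing one
# expression node, plus two expression nodes that interact.
toy_edges = [
    ("cnv:A", "expr:A"),
    ("expr:A", "expr:B"),
    ("expr:B", "expr:A"),
]
toy_weights = {"cnv:A": 1.0, "expr:A": 1.0, "expr:B": 1.0}

scores = random_walk_with_restart(toy_edges, toy_weights)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

In this toy graph the node that receives edges from both other nodes ends up with the highest score, which is the sense in which a walk of this kind prioritizes genes by their position in the graph as well as by their initial weight.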
Summary
Integrating the rich information from multi-omics data has been a popular approach to survival prediction and bio-marker identification for several cancer studies. Most network-based methods have focused on incorporating pathway or subtype information rather than using individual genomic features in different types of cancer datasets [9,10,11,12,13,14,15,16,17,18]. In this respect, pathway-based methods have been proposed for the identification of important genes within pathways