Abstract

Background: Identification of cancer subtypes is of great importance to facilitate cancer diagnosis and therapy. A number of methods have been proposed in recent years to integrate multi-source data to identify cancer subtypes. However, few of them consider the regulatory associations between genome features or the contribution weights of different data-views in data integration. It is widely accepted that the regulatory associations between features play important roles in cancer subtype studies. In addition, different data-views may contribute differently to data integration for cancer subtype prediction.

Results: In this paper, we propose a method, CSPRV, to improve cancer subtype prediction by incorporating multi-source transcriptome expression data and heterogeneous biological networks. We extract multiple expression features of each genome element based on the regulatory associations in the heterogeneous biological networks and use a generalized matrix correlation method (RV2) to predict the similarities between samples in each view of expression data. We fuse the similarity information in multiple data-views according to different integration weights. Based on the integrated similarities between samples, we cluster samples into different subtype groups. Comprehensive experiments on TCGA cancer datasets demonstrate that the proposed method can identify more clinically meaningful cancer subtypes compared with most existing methods.

Conclusions: Considering the regulatory associations between biological features and the contributions of different data-views is important for improving the understanding of cancer subtypes. The proposed method provides an open framework to incorporate transcriptome expression data and biological regulation networks to predict cancer subtypes.
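The similarity and fusion steps described in the Results can be illustrated with a minimal sketch. It is not the CSPRV implementation. It assumes that each sample in a view is represented as a small matrix (one row per genome element, one column per extracted expression feature), that columns are mean-centered before comparison, and that RV2 refers to the modified RV coefficient, in which the diagonals of the cross-product matrices are removed. The fusion step is a plain weighted average used as a stand-in, since the abstract does not state the fusion rule. All function names are hypothetical.

```python
from math import sqrt


def center_columns(X):
    """Subtract each column's mean (X is a list of equal-length rows)."""
    n_rows = len(X)
    means = [sum(col) / n_rows for col in zip(*X)]
    return [[value - m for value, m in zip(row, means)] for row in X]


def gram_without_diagonal(X):
    """Return X X^T with its diagonal entries set to zero."""
    n = len(X)
    G = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        G[i][i] = 0.0
    return G


def rv2(X, Y):
    """Modified RV coefficient between two matrices with equal row counts."""
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same number of rows")
    Gx = gram_without_diagonal(center_columns(X))
    Gy = gram_without_diagonal(center_columns(Y))
    numerator = sum(a * b for rx, ry in zip(Gx, Gy) for a, b in zip(rx, ry))
    norm_x = sum(a * a for row in Gx for a in row)
    norm_y = sum(b * b for row in Gy for b in row)
    denominator = sqrt(norm_x * norm_y)
    # Degenerate input (all off-diagonal cross-products zero) gets similarity 0.
    return numerator / denominator if denominator > 0 else 0.0


def sample_similarity(samples):
    """Pairwise RV2 similarities for one data-view (assumed representation)."""
    n = len(samples)
    # Self-similarity is fixed at 1.0 by convention in this sketch.
    S = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            S[i][j] = S[j][i] = rv2(samples[i], samples[j])
    return S


def fuse_views(similarities, weights):
    """Weighted average of per-view similarity matrices (stand-in fusion)."""
    if len(similarities) != len(weights):
        raise ValueError("one weight is required per data-view")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(similarities[0])
    return [[sum(w * S[i][j] for S, w in zip(similarities, weights)) / total
             for j in range(n)]
            for i in range(n)]
```

In this sketch, `sample_similarity` would be called once per data-view and the resulting matrices passed to `fuse_views` together with one weight per view. The fused matrix could then be handed to any clustering method that accepts a precomputed similarity matrix.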

Highlights

  • Identification of cancer subtypes is of great importance to facilitate cancer diagnosis and therapy

  • Many computational approaches have been proposed to take advantage of multiple types of cancer data to detect more clinically meaningful cancer subtypes [8, 11, 12, 13, 14], such as iCluster [11, 13], CNMF [13, 15], SNF [8], WSNF [5] and ANF [16]. iCluster is a machine learning method that identifies subtype clusters from multiple data sources using the EM algorithm, although feature selection is usually necessary for it to work on high-dimensional data

  • WSNF incorporates mRNA-TF-miRNA regulatory network information to predict the importance of each feature, and identifies cancer subtypes using the SNF framework based on the weighted similarity information between samples


Summary

Introduction

Identification of cancer subtypes is of great importance to facilitate cancer diagnosis and therapy. WSNF incorporates mRNA-TF-miRNA regulatory network information to predict the importance of each feature, and identifies cancer subtypes using the SNF framework based on the weighted similarity information between samples. Although these integrative methods have proven effective in subtype prediction, they did not consider data-view weights in data integration, even though different data-views may contribute differently to subtype prediction. The heterogeneous biological regulatory network captures the relationships between features, and incorporating this network information in data integration is expected to improve subtype prediction, since different regulatory mechanisms may exist in different cancer subtypes.

