Identification of cancer subtypes by integrating multiple types of transcriptomics data with deep learning in breast cancer

Yang Guo, Zhanhuai Li, Xuequn Shang

doi:10.1016/j.neucom.2018.03.072

Abstract

The identification of cancer subtypes is vital for advancing the precision of cancer diagnosis and therapy. Several studies have integrated multiple types of genomics data to investigate cancer subtypes. However, (1) few of them specifically considered the intrinsic correlations within each type of data; (2) to the best of our knowledge, none of them considered transcriptome alternative splicing regulation in data integration. In recent years, many cancers have been shown to be related to abnormal alternative splicing regulation. In this paper, we propose a hierarchical deep learning framework, HI-SAE, to integrate gene expression and transcriptome alternative splicing profile data to identify cancer subtypes. We adopt the stacked autoencoder (SAE) neural network to learn high-level representations of each type of data separately, and then integrate all the learned high-level representations through another learning layer to learn more complex data representations. Based on the final learned representations, we cluster patients into different cancer subtype groups. Comprehensive experiments on TCGA breast cancer data demonstrate that our model provides an effective and useful approach for integrating multiple types of transcriptomics data to identify cancer subtypes, and that the transcriptome alternative splicing data offers distinguishable clues to cancer subtypes.
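
The hierarchical pipeline described in the abstract (one stacked autoencoder per data type, a further learning layer over the concatenated representations, then clustering) might be sketched roughly as follows. This is an illustrative NumPy sketch only, not the authors' HI-SAE implementation. The layer sizes, sigmoid activations, full-batch gradient descent, the use of another autoencoder as the integration layer, the choice of k-means for clustering, and all function names (`train_autoencoder`, `stacked_encode`, `hi_sae_sketch`) are assumptions, since the abstract does not specify them. The input matrices are random placeholders rather than TCGA data.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_autoencoder(X, n_hidden, epochs=200, lr=0.1, seed=0):
    """Fit a one-hidden-layer autoencoder (sigmoid encoder, linear decoder)
    by full-batch gradient descent on mean squared reconstruction error.
    Returns only the encoder parameters (W1, b1)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)            # encode
        R = H @ W2 + b2                     # decode (reconstruction)
        err = (R - X) / n                   # gradient of the loss w.r.t. R
        dH = (err @ W2.T) * H * (1.0 - H)   # backpropagate through the sigmoid
        W2 -= lr * (H.T @ err); b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X.T @ dH);  b1 -= lr * dH.sum(axis=0)
    return W1, b1

def stacked_encode(X, layer_sizes, seed=0):
    """Greedy layer-wise stacking: each autoencoder is trained on the codes
    produced by the previous one; returns the top-level codes."""
    H = X
    for i, size in enumerate(layer_sizes):
        W, b = train_autoencoder(H, size, seed=seed + i)
        H = sigmoid(H @ W + b)
    return H

def kmeans(Z, k, iters=50, seed=0):
    """Plain Lloyd's k-means (an assumed clustering step)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(iters):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def hi_sae_sketch(expr, splicing, k, seed=0):
    """Hypothetical two-level integration: per-modality SAEs, then one more
    autoencoder layer over the concatenated codes, then clustering."""
    h_expr = stacked_encode(expr, [16, 8], seed=seed)
    h_spl = stacked_encode(splicing, [16, 8], seed=seed + 10)
    joint = np.hstack([h_expr, h_spl])
    z = stacked_encode(joint, [4], seed=seed + 20)
    return z, kmeans(z, k, seed=seed)

# Placeholder inputs: 60 "patients", 40 expression and 30 splicing features.
demo_rng = np.random.default_rng(42)
expr = demo_rng.normal(size=(60, 40))
splicing = demo_rng.normal(size=(60, 30))
z, labels = hi_sae_sketch(expr, splicing, k=3)
print(z.shape, labels.shape)
```

Training each modality's encoder separately before fusing is what lets the first level capture correlations within a single data type, while the integration layer only sees the compressed codes. With real data, feature scaling and the number of clusters would need to be chosen for the dataset at hand.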
