Abstract B1-08: Deep learning for the large-scale cancer data analysis

Shingo Tsuji, Hiroyuki Aburatani

doi:10.1158/1538-7445.compsysbio-b1-08

Abstract

Advanced cancer research has been producing vast amounts of multi-omics data across various cancer types, and most of these data are publicly available. It is important to analyze these data with unbiased, generalizable methodology to obtain deeper insights into cancer biology. Machine learning algorithms such as k-means clustering, Support Vector Machines (SVM), and Random Forests (RF) have been applied to biological data analysis, including cancer research. In this area, deep learning is gaining attention because of its high performance and its general applicability to complex data. To explore the possibilities of applying deep learning algorithms to cancer multi-omics data analysis, we compared the performance of SVM, RF, and deep learning. We built models for predicting the cancer types in the TCGA pan-cancer data set and confirmed the advantages of deep learning. When we applied the algorithm to multi-omics data such as gene expression, DNA methylation, and somatic mutations, the input nodes of the neural network represented each data type. Autoencoders, which are components of the deep learning models, encode weighted relationships among the input nodes; therefore, we can find unknown dependencies among nodes by analyzing the networks of the model. The results suggest that deep learning architectures are a promising methodology for advancing cancer research in the era of data explosion.

Citation Format: Shingo Tsuji, Hiroyuki Aburatani. Deep learning for the large-scale cancer data analysis. [abstract]. In: Proceedings of the AACR Special Conference on Computational and Systems Biology of Cancer; Feb 8-11 2015; San Francisco, CA. Philadelphia (PA): AACR; Cancer Res 2015;75(22 Suppl 2):Abstract nr B1-08.
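The kind of comparison described in the abstract (SVM, RF, and a neural network predicting class labels from a feature matrix) could be set up roughly as follows. This is only a minimal sketch under several assumptions: it uses scikit-learn, a synthetic data set from `make_classification` in place of the TCGA pan-cancer data, and a small multilayer perceptron as a stand-in for the deep learning model, whose actual architecture the abstract does not specify. The function name `compare_classifiers` and all hyperparameters are illustrative choices, not the authors' settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(n_samples=300, n_features=50, n_classes=4, seed=0):
    """Return the mean 3-fold cross-validated accuracy of each model."""
    # Assumption: synthetic stand-in for a samples-by-features omics matrix.
    # This is NOT TCGA data; class labels play the role of cancer types.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=20,
        n_classes=n_classes,
        random_state=seed,
    )

    # Hypothetical hyperparameters chosen only for illustration.
    models = {
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        # A small multilayer perceptron stands in for the deep learning model.
        "MLP": make_pipeline(
            StandardScaler(),
            MLPClassifier(
                hidden_layer_sizes=(64, 32), max_iter=500, random_state=seed
            ),
        ),
    }

    # Stratified 3-fold cross-validation is used for every model, so the
    # accuracies are computed on the same splits and are comparable.
    return {
        name: cross_val_score(model, X, y, cv=3).mean()
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, accuracy in compare_classifiers().items():
        print(f"{name}: {accuracy:.3f}")
```

Accuracies obtained on synthetic data say nothing about the results reported in the abstract; the sketch only illustrates how the three model families can be evaluated on identical cross-validation splits.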
