Abstract

Improved cancer prognosis is a central goal of precision health medicine. Although many models can predict differential survival from data, there is a strong need for sophisticated algorithms that can aggregate and filter relevant predictors from increasingly complex data inputs. In turn, these models should provide deeper insight into which types of data are most relevant for improving prognosis. Deep Learning-based neural networks offer a potential solution to both problems because they are highly flexible and account for data complexity in a non-linear fashion. In this study, we implement Deep Learning-based networks to determine how gene expression data predict survival, modeled with Cox regression, in breast cancer. We accomplish this through an algorithm called SALMON (Survival Analysis Learning with Multi-Omics Neural Networks), which aggregates and simplifies gene expression data and cancer biomarkers to enable prognosis prediction. The results revealed improved performance when more omics data were used in model construction. Rather than using raw gene expression values as model inputs, we innovatively use the eigengenes of the modules produced by gene co-expression network analysis. The high-impact co-expression modules and other omics data are identified by a feature selection technique and then examined through enrichment analysis of their biological functions, which raises the interpretation of input features from the gene level to the co-expression module level. Our study shows the feasibility of discovering breast cancer-related co-expression modules and sketches a blueprint for future work on Deep Learning-based survival analysis. SALMON source code is available at https://github.com/huangzhii/SALMON/.
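To make the eigengene idea concrete, the following is a minimal sketch of how one co-expression module can be collapsed into a single feature per sample. It assumes the common definition of an eigengene as the first principal component of the module's standardized expression matrix; the function name `module_eigengene`, the row-wise standardization, and the sign convention are illustrative assumptions and are not taken from the SALMON or lmQCM source code.

```python
import numpy as np

def module_eigengene(expr):
    """Summarize one co-expression module as a single vector over samples.

    expr: array of shape (n_genes, n_samples) for the genes in one module.
    Returns a unit-norm vector of length n_samples (hypothetical helper,
    not the lmQCM implementation).
    """
    expr = np.asarray(expr, dtype=float)
    # Standardize each gene across samples so no gene dominates by scale.
    centered = expr - expr.mean(axis=1, keepdims=True)
    std = centered.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # leave constant genes at zero instead of dividing by 0
    z = centered / std
    # The first right singular vector is the dominant pattern across samples.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    eigengene = vt[0]
    # Singular vectors have arbitrary sign; align with the module's mean profile.
    if np.dot(eigengene, z.mean(axis=0)) < 0:
        eigengene = -eigengene
    return eigengene

# Toy module: three genes that follow one shared pattern plus small noise.
rng = np.random.default_rng(0)
pattern = rng.normal(size=20)
toy_module = np.outer([1.0, 2.0, 0.5], pattern) + 0.01 * rng.normal(size=(3, 20))
toy_eigengene = module_eigengene(toy_module)
```

Stacking one such vector per module yields a samples-by-modules matrix, which is far narrower than the raw gene-level matrix and is the kind of input the abstract describes.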

Highlights

  • There is a strong need to identify effective prognostic biomarkers to help optimize and personalize treatment (Liu et al, 2016)

  • We advocate the use of eigengene matrices, derived from co-expression analysis with the R package “local maximal Quasi-Clique Merger (lmQCM),” instead of the original mRNA-seq and miRNA-seq data. Using a neural network architecture, multi-omics data, and the Cox proportional hazards model, we develop our model, called SALMON (Survival Analysis Learning with Multi-Omics Neural Networks)

  • The experiments were performed with six different combinations of multi-omics data as input sources: (i) mRNA-seq data (57 features); (ii) miRNA-seq data (12 features); (iii) integration of mRNA and miRNA (69 features); (iv) integration of mRNA, miRNA, copy number burden (CNB), and tumor mutation burden (TMB) (71 features); (v) integration of mRNA, miRNA, and demographic and clinical data (72 features); (vi) integration of mRNA, miRNA, CNB, TMB, and demographic and clinical data (74 features)
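The feature counts in the six combinations follow from simple addition, which the short sketch below checks. The per-source sizes for mRNA (57) and miRNA (12) come from the text; the sizes for CNB and TMB (one value each) and for the demographic and clinical data (three features) are inferred from the stated totals (71 - 69 and 72 - 69) and should be read as assumptions. The names `SOURCES`, `COMBINATIONS`, and `input_width` are illustrative.

```python
# Number of features contributed by each data source.
# mRNA and miRNA sizes are stated in the text; the others are inferred
# from the reported totals and are assumptions.
SOURCES = {
    "mRNA": 57,      # eigengene features from mRNA-seq
    "miRNA": 12,     # eigengene features from miRNA-seq
    "CNB": 1,        # copy number burden (assumed scalar)
    "TMB": 1,        # tumor mutation burden (assumed scalar)
    "clinical": 3,   # demographic and clinical data (inferred: 72 - 69)
}

# The six input combinations listed above.
COMBINATIONS = {
    "i": ["mRNA"],
    "ii": ["miRNA"],
    "iii": ["mRNA", "miRNA"],
    "iv": ["mRNA", "miRNA", "CNB", "TMB"],
    "v": ["mRNA", "miRNA", "clinical"],
    "vi": ["mRNA", "miRNA", "CNB", "TMB", "clinical"],
}

def input_width(source_names):
    """Total number of input features for a combination of sources."""
    return sum(SOURCES[name] for name in source_names)

WIDTHS = {label: input_width(names) for label, names in COMBINATIONS.items()}
# WIDTHS == {"i": 57, "ii": 12, "iii": 69, "iv": 71, "v": 72, "vi": 74}
```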


Introduction

There is a strong need to identify effective prognostic biomarkers to help optimize and personalize treatment (Liu et al, 2016). Breast invasive carcinoma is one of the most heterogeneous cancers, with distinct prognoses based on morphological, phenological, and molecular stratifications (Nagini, 2017; Wu et al, 2017). Breast invasive carcinoma patients have a 77% survival rate after 5 years and a 44% survival rate after 15 years (Pereira et al, 2016), so developing accurate prognostic models could significantly improve risk stratification after diagnosis. Deep Learning-based approaches have recently been widely applied in Computational Biology and Bioinformatics (Huang et al, 2017; Zhang et al, 2018b). Their ability to learn nonlinear functions and to retrieve lower-dimensional representations (Ching et al, 2018) is a key advantage of Deep Learning models. Survival prognosis that incorporates Cox proportional hazards regression, either with a single transcriptomic dataset (Ching et al, 2018; Katzman et al, 2018; Shao et al, 2018) or with multi-omics data (Chaudhary et al, 2018; Poirion et al, 2018; Ramazzotti et al, 2018; Sun et al, 2018; Zhang et al, 2018a), is of major interest in precision health.
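Networks that incorporate Cox proportional hazards regression are typically trained by minimizing the negative log partial likelihood of the predicted risk scores. The sketch below shows that objective in plain NumPy under stated assumptions: Breslow-style handling of tied times and averaging over the number of observed events are choices made here for illustration, and the function name `cox_neg_log_partial_likelihood` is hypothetical rather than part of SALMON.

```python
import numpy as np

def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative Cox log partial likelihood, averaged over observed events.

    risk:  predicted log-risk score per patient (e.g. a network output)
    time:  follow-up time per patient
    event: 1 if the event was observed, 0 if the patient was censored
    Assumes at least one observed event.
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    # at_risk[i, j] is True when patient j is still under observation
    # at patient i's follow-up time (ties are included, Breslow style).
    at_risk = time[None, :] >= time[:, None]
    # Stable log of the summed exp(risk) over each patient's risk set.
    shift = risk.max()
    risk_set_sum = (at_risk * np.exp(risk - shift)[None, :]).sum(axis=1)
    log_risk_set = shift + np.log(risk_set_sum)
    # Only patients with an observed event contribute a term.
    return -np.sum((risk - log_risk_set) * event) / event.sum()

# Two patients, both with observed events; patient 0 fails first.
times = [1.0, 2.0]
events = [1, 1]
loss_good = cox_neg_log_partial_likelihood([1.0, 0.0], times, events)
loss_bad = cox_neg_log_partial_likelihood([0.0, 1.0], times, events)
```

In this toy case `loss_good` is smaller than `loss_bad`, because assigning the higher risk score to the patient who fails earlier agrees with the observed ordering.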
