Abstract

The age of precision medicine demands powerful computational techniques to handle high-dimensional patient data. We present MultiSurv, a multimodal deep learning method for long-term pan-cancer survival prediction. MultiSurv uses dedicated submodels to establish feature representations of clinical, imaging, and different high-dimensional omics data modalities. A data fusion layer aggregates the multimodal representations, and a prediction submodel generates conditional survival probabilities for follow-up time intervals spanning several decades. MultiSurv is the first non-linear and non-proportional survival prediction method that leverages multimodal data. In addition, MultiSurv can handle missing data, including single values and complete data modalities. MultiSurv was applied to data from 33 different cancer types and yields accurate pan-cancer patient survival curves. A quantitative comparison with previous methods showed that MultiSurv achieves the best results according to different time-dependent metrics. We also generated visualizations of the learned multimodal representation of MultiSurv, which revealed insights into cancer characteristics and heterogeneity.
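The step from per-interval conditional survival probabilities to a patient survival curve can be sketched as follows. This is a minimal illustration of the standard discrete-time relationship (the survival probability at the end of interval j is the product of the conditional probabilities up to and including j), not the authors' implementation. The function name and the example values are hypothetical.

```python
from itertools import accumulate
from operator import mul


def survival_curve(conditional_probs):
    """Turn per-interval conditional survival probabilities into a survival curve.

    conditional_probs[j] is the probability of surviving interval j,
    given that the patient was alive at the start of interval j.
    The returned list holds the probability of surviving through the
    end of each interval.
    """
    for p in conditional_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    # The survival curve is the running product of the conditional probabilities.
    return list(accumulate(conditional_probs, mul))


# Hypothetical model output for four follow-up intervals (illustrative values only).
example_output = [0.9, 0.8, 0.5, 1.0]
curve = survival_curve(example_output)
```

Because every factor lies between 0 and 1, the resulting curve can never increase over time, which is the behavior expected of a survival function.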

Highlights

  • The age of precision medicine demands powerful computational techniques to handle high-dimensional patient data

  • Other methods focused on imaging data, such as CXR-risk, which uses chest radiographs[15]; LungNet[16] and a gastric cancer survival prediction model[17], which use computed tomography (CT) images; a nasopharyngeal carcinoma survival prediction model[18], which uses magnetic resonance imaging (MRI) data; and WSISA[19], which employs histopathology slides

  • MultiSurv uses a set of six data modalities with potential complementary predictive value in cancer survival
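How representations from several modalities might be combined when some are absent for a patient can be sketched as follows. This is an illustrative assumption only: the element-wise maximum, the function name, and the modality names are hypothetical and are not taken from the source, which states only that a fusion layer aggregates the representations and that whole modalities may be missing.

```python
def fuse_representations(representations):
    """Aggregate per-modality feature vectors into one fused vector.

    `representations` maps a modality name to a list of floats, or to
    None when that modality is missing for the patient. Missing
    modalities are skipped. The element-wise maximum used here is an
    assumed aggregation chosen for illustration.
    """
    available = [vec for vec in representations.values() if vec is not None]
    if not available:
        raise ValueError("at least one modality is required")
    length = len(available[0])
    if any(len(vec) != length for vec in available):
        raise ValueError("all representations must have the same length")
    # zip(*available) groups the i-th feature of every available modality.
    return [max(features) for features in zip(*available)]


# Hypothetical patient with one whole modality missing.
patient = {
    "clinical": [0.2, 0.7, 0.1],
    "imaging": None,
    "omics": [0.5, 0.3, 0.4],
}
fused = fuse_representations(patient)
```

An aggregation that works on any number of inputs keeps the fused vector at a fixed size, so a downstream prediction submodel would not need to know which modalities were present.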


Summary

Introduction

The age of precision medicine demands powerful computational techniques to handle high-dimensional patient data. The most basic cancer prognosis prediction technique relies on population-level estimates for the specific cancer site and stage. This method fails to take into account the differences between individual patients, even such fundamental ones as age at diagnosis. The classical statistical approach to modeling survival data with censored observations is the semi-parametric Cox proportional hazards (CPH) model[6]. This method is widely used[7], but has two important limitations. With the advent of big data collection for precision medicine, a wealth of new data modalities is increasingly available in routine clinical practice. Integration of such big data demands powerful modeling approaches, reinforcing the call for deep learning (DL)-based methods[9,11]. Other methods focused on imaging data, such as CXR-risk, which uses chest radiographs[15]; LungNet[16] and a gastric cancer survival prediction model[17], which use computed tomography (CT) images; a nasopharyngeal carcinoma survival prediction model[18], which uses magnetic resonance imaging (MRI) data; and WSISA[19], which employs histopathology slides.
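For reference, the CPH model mentioned above is conventionally written as shown below. The notation is the standard textbook form and is not taken from the source: h_0(t) is an unspecified baseline hazard, x is a patient's covariate vector, and β is the coefficient vector.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\!\left(\beta^{\top}(x_1 - x_2)\right)
```

Two properties are visible in this form: the log-hazard is linear in the covariates, and the hazard ratio between any two patients does not depend on time. These are presumably the limitations the text alludes to, in line with the description of MultiSurv as non-linear and non-proportional.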
