The objective of this study is to develop a multimodal neural network (MMNN) model that analyzes the clinical variables and MRI images of a patient with soft tissue sarcoma (STS) to predict overall survival and the risk of distant metastases. We compare the performance of this MMNN to models based on clinical variables alone, radiomics models, and a unimodal neural network. We included patients aged 18 or older with biopsy-proven STS who underwent primary resection between January 1st, 2005, and December 31st, 2020, and who had complete outcome data and a pre-treatment MRI with both a T1 post-contrast sequence and a T2 fat-sat sequence available. A total of 9380 MRI slices containing sarcomas from 287 patients were available. Our MMNN accepts the entire 3D sarcoma volume from the T1 and T2 MRIs together with the clinical variables. Gradient blending balances the training of the clinical and image sub-networks so that both converge without overfitting. Heat maps were generated to visualize the salient image features. Our MMNN outperformed all other models in predicting overall survival and the risk of distant metastases. The C-Index of our MMNN was 0.77 for overall survival and 0.70 for the risk of distant metastases. The heat maps highlight the areas of the sarcomas deemed most salient for the predictions. Our multimodal neural network with gradient blending improves predictions of overall survival and the risk of distant metastases in patients with soft tissue sarcoma. Future work enabling accurate subtype-specific predictions will likely use a similar end-to-end multimodal neural network architecture and will require prospective curation of high-quality data, the inclusion of genomic data, and the involvement of multiple centers through federated learning.
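For readers unfamiliar with the C-Index reported above, the following is a minimal sketch of how a concordance index can be computed for right-censored survival data. It assumes Harrell's formulation; the abstract does not state which variant was used. The function name `concordance_index` and the toy inputs are hypothetical and are not taken from the study.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell-style C-index (illustrative sketch, not the study's code).

    times  : observed follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    risks  : predicted risk score; higher means earlier expected event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # 'a' is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk for the earlier event: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data for illustration only (not from the study).
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.3, 0.5, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

Under this definition, a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which gives context for the reported values of 0.77 and 0.70.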