Abstract

Improving the prognosis of overall survival (OS) in cancer patients is critical for personalizing treatment using model-identified drivers of cancer progression. Current cancer prognosis models rely largely on clinical and demographic patient characteristics. Adding 'omics'-based modalities can improve patient OS prediction and lead to better disease categorization and understanding. We introduce a data-driven methodology for combining multi-omics and clinical data, including clinical/demographic characteristics, mutations, gene expression, long non-coding and micro-RNA expression, DNA methylation, and proteomics, to improve prediction of OS in cancer patients. The high dimensionality of 'omics' modalities makes it challenging to combine them into one model. We propose a late-stage fusion of modalities: we construct a separate data-driven OS prediction model for each modality, then combine the individual predictions in a final linear OS prediction model. With a limited number of patients available to develop the model, this approach helps protect against overfitting and accounts for the different degrees of informativeness of the modalities by weighting each according to its individual performance. We introduce a robust machine learning pipeline with rigorous training, testing, and evaluation capabilities, and demonstrate its effectiveness on a suite of TCGA data. When comparing early vs. late fusion of omics and clinical modalities for survival prediction using NSCLC TCGA data, we observe a C-index improvement from 0.57 ± 0.04 to 0.61 ± 0.01. The best individual modality performance was 0.59 ± 0.02, using the clinical modality. The dominant modalities in unimodal survival analysis varied between cancers: clinical, RNA, and miRNA for LUAD, and clinical, RNA, and RPPA for LUSC. Using pan-cancer TCGA data for survival prediction, the best C-index (0.77) was achieved by the multi-omics model, followed by 0.76 ± 0.01 for the clinical, 0.75 ± 0.01 for the RNA-seq, and 0.73 ± 0.01 for the RPPA unimodal models.
Citation Format: Nikos Nikolaou, Domingo Salazar, Harish RaviPrakash, Miguel Goncalves, Gustavo Alonso Arango Argoty, Nikolay Burlutsky, Natasha Markuzon, Etai Jacob. Improving survival prediction using flexible late fusion machine learning framework for multi-omics data integration. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5395.
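
The late-fusion idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the final linear model is fitted, so everything below is an assumption made for illustration: the function names (`concordance_index`, `late_fusion_weights`, `fuse`), the rule that weights each modality by how far its validation C-index exceeds chance (0.5), and the toy data. The per-modality risk scores stand in for the outputs of unimodal survival models, which are not shown.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C-index: share of comparable pairs ranked correctly.
    A pair (i, j) is comparable when i has an observed event and time[i] < time[j];
    it is concordant when the earlier failure has the higher risk score."""
    time, event, risk = map(np.asarray, (time, event, risk))
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties count as half
    return concordant / comparable if comparable else float("nan")

def late_fusion_weights(time, event, modality_risks):
    """Hypothetical weighting rule: weight is proportional to (C-index - 0.5), floored at 0."""
    skill = {m: max(concordance_index(time, event, r) - 0.5, 0.0)
             for m, r in modality_risks.items()}
    total = sum(skill.values())
    if total == 0:  # no modality beats chance: fall back to equal weights
        return {m: 1.0 / len(skill) for m in skill}
    return {m: s / total for m, s in skill.items()}

def fuse(modality_risks, weights):
    """Linear late fusion: weighted sum of z-scored unimodal risk scores."""
    fused = 0.0
    for m, r in modality_risks.items():
        r = np.asarray(r, dtype=float)
        std = r.std() or 1.0  # guard against constant scores
        fused = fused + weights[m] * (r - r.mean()) / std
    return fused

# Toy data (fabricated for illustration only, not TCGA values).
time = np.array([2.0, 4.0, 5.0, 7.0, 9.0, 12.0])   # follow-up time
event = np.array([1, 1, 0, 1, 1, 0])               # 1 = death observed, 0 = censored
risks = {
    "clinical": np.array([6.0, 5.0, 4.0, 3.0, 2.0, 1.0]),
    "rna": np.array([5.0, 6.0, 3.0, 4.0, 1.0, 2.0]),
}

weights = late_fusion_weights(time, event, risks)
fused = fuse(risks, weights)
for name, r in risks.items():
    print(name, round(concordance_index(time, event, r), 3), round(weights[name], 3))
print("fused", round(concordance_index(time, event, fused), 3))
```

In practice the weights would be estimated on held-out data rather than on the same patients used for evaluation, since fitting and scoring on one set inflates the fused C-index.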
