Abstract

Most non-small cell lung cancer (NSCLC) prognosis prediction approaches use a single data type and do not take advantage of the large amount of multimodal data available. To evaluate and explore the benefits of multimodal data integration, we present a combined feature selection and denoising autoencoder pipeline for NSCLC survival prediction and survival subtype identification using microRNA (miRNA), mRNA, DNA methylation, long non-coding RNA (lncRNA) and clinical data. Survival performance for both lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) patients was compared across modality combinations, timing of data integration and training data types. Multimodal data combinations outperformed single data modalities, with the early integration of all data modalities achieving concordance indexes (C-indexes) of 0.67 (±0.04) and 0.63 (±0.02) for LUAD and LUSC, respectively, versus C-indexes of 0.64 (±0.02) and 0.59 (±0.03) for the best single modality (clinical). Notably, combining just lncRNA and clinical data facilitated effective survival discrimination, with C-indexes of 0.69 (±0.03) for LUAD and 0.62 (±0.03) for LUSC. Overall, higher performance was achieved by using a single denoising autoencoder for all biological data (early integration) and by training on both LUSC and LUAD patient data together. Two survival subtypes (log-rank test p-value = 1e-9) were identified, with 991 differentially expressed transcripts in the poorer survival group. Our analysis shows the value of multimodal data integration for predicting NSCLC progression, with especially good performance using the combination of lncRNA and clinical data. Early integration of biological data, with an initial linear feature selection technique and a denoising autoencoder for dimensionality reduction, showed effective survival performance and survival subtype identification.
Further research is underway to expand analysis to different cancer types and data modalities and extract more biological interpretability from autoencoder models.

Citation Format: Jacob G. Ellen, Etai Jacob, Nikos Nikolaou, Natasha Markuzon. Autoencoder-based multimodal prediction of survival for non-small cell lung cancer. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5373.
