Abstract

Simple Summary: Robust methods for modelling and estimating cancer survival could help in understanding and limiting the impact of cancer. This study aimed to develop an efficient machine learning (ML) pipeline that could model survival in lung adenocarcinoma (LUAD) patients. Image transformations of multi-omics data were used to train a machine vision-based model capable of segregating patients into high-risk and low-risk subgroups. Performance was evaluated using the concordance index, the Brier score, and similar metrics. The proposed model outperformed similar methods with a high degree of confidence. Furthermore, critical modules in the cell cycle and in pathways were identified.

The utility of multi-omics in personalized therapy and cancer survival analysis has been debated and demonstrated extensively in the recent past. Most current methods still suffer from data constraints such as high dimensionality, unexplained interdependence, and subpar integration methods. Here, we propose SurvCNN, an alternative approach that processes multi-omics data with robust computer vision architectures to predict cancer prognosis for lung adenocarcinoma patients. Numerical multi-omics data were transformed into their image representations and fed into a convolutional neural network with a discrete-time model to predict survival probabilities. The framework also dichotomized patients into risk subgroups based on their survival probabilities over time. SurvCNN was evaluated on multiple performance metrics and outperformed existing methods with a high degree of confidence. Moreover, the relative performance of various combinations of omics datasets was examined in detail. Critical biological processes, pathways, and cell types identified from downstream processing of differentially expressed genes suggested that the framework could elucidate elements detrimental to a patient’s survival. Such integrative models with high predictive power would have a significant impact and utility in precision oncology.
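The discrete-time survival output, the risk dichotomization, and the concordance index mentioned above can be illustrated with a small sketch. This is not the SurvCNN implementation: the function names, the risk score (one minus the mean survival probability), and the median cutoff are assumptions chosen for illustration, and the hazards would in practice come from the network's output layer.

```python
from statistics import median


def survival_curve(hazards):
    """Turn per-interval conditional hazards h_1..h_K into
    survival probabilities S(t_k) = prod_{j<=k} (1 - h_j)."""
    curve, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        curve.append(s)
    return curve


def risk_groups(hazards_per_patient):
    """Assumed scoring rule: risk = 1 - mean survival probability.
    Patients above the median risk are labelled "high"."""
    scores = [1.0 - sum(survival_curve(h)) / len(h) for h in hazards_per_patient]
    cutoff = median(scores)
    return scores, ["high" if s > cutoff else "low" for s in scores]


def concordance_index(times, events, scores):
    """Harrell's C: among comparable pairs (the earlier time is an
    observed event), the share where the earlier patient has higher risk.
    Ties in risk count as half; ties in time are ignored."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical hazards for four patients over three time intervals.
hazards = [[0.1, 0.2, 0.3], [0.05, 0.05, 0.05], [0.4, 0.4, 0.4], [0.2, 0.1, 0.1]]
scores, groups = risk_groups(hazards)
c_index = concordance_index([14, 30, 5, 22], [1, 0, 1, 1], scores)
```

A concordance index of 1.0 means every comparable pair is ranked correctly, while 0.5 corresponds to random ranking.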

Highlights

  • This work demonstrated that numerical multi-omics data, transformed into their image representations, could be used to extract meaningful information about an individual’s genomic profile

  • These representations of genomic information simplify the process of identifying genetic clusters that are coregulated in a diseased individual and predict the individual’s likelihood of survival in terms of hazard ratios

  • Though we have primarily focused on its application to cancer prognosis, the algorithms could be extended to a wide variety of other applications
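As a rough illustration of what turning numerical data into an image representation can mean, the sketch below scales a feature vector to [0, 1] and lays it out row by row on a square grid. The excerpt does not describe the transformation actually used in this work, so the function name `to_image`, the min-max scaling, the row-major layout, and the zero padding are all assumptions.

```python
import math


def to_image(features):
    """Min-max scale a feature vector to [0, 1] and place it row by row
    on the smallest square grid that fits, padding the tail with zeros."""
    if not features:
        raise ValueError("features must be non-empty")
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0  # constant vectors map to all zeros
    scaled = [(x - lo) / span for x in features]
    side = math.isqrt(len(scaled) - 1) + 1  # smallest side with side * side >= n
    padded = scaled + [0.0] * (side * side - len(scaled))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Five hypothetical expression values become a 3 x 3 grid.
image = to_image([2.0, 4.0, 6.0, 8.0, 10.0])
```

A row-major layout places features next to each other arbitrarily. A layout in which related features sit close together would give a convolutional network more spatial structure to exploit.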


Summary

Introduction

It has led to quantifying diverse omics biomarkers in a clinically and economically feasible manner for an individual. This flexibility has allowed scientists to build robust, personalized predictive models that use the generated data to evaluate prognostic dependencies such as clinical outcomes and probability of relapse [4,5,6,7]. Resources such as The Cancer Genome Atlas (TCGA), the International Cancer Genomics Consortium (ICGC), and the Cancer Cell Line Encyclopedia (CCLE), among others, have standardized the process of curating and hosting data from multiple studies, making them accessible [8,9,10]. A study suggested that most LUAD patients tend to be non-smokers, contrary to the general perception of a smoking-related basis to lung cancers [12].
