Abstract

Background and Objective: The morbidity of lung adenocarcinoma (LUAD) has been increasing year by year, and its prognosis is poor. This has prompted researchers to study the survival of LUAD patients so that patients can be cured in time or survive after appropriate treatment. However, there is still no fully valid model that can be applied in clinical practice.

Methods: We introduced struc2vec-based multi-omics data integration (SBMOI), which integrates gene expression, somatic mutation, and clinical data to construct mutation gene vectors representing LUAD patient features. Based on these patient features, a random survival forest (RSF) model was used to predict the long- and short-term survival of LUAD patients. To further demonstrate the superiority of SBMOI, we also replaced the scale-free gene co-expression network (FCN) with a protein-protein interaction (PPI) network and with a significant co-expression network (SCN), and compared the accuracy of LUAD patient survival prediction under the same conditions.

Results: Our results suggested that, compared with the SCN and the PPI network, the FCN-based SBMOI combined with the RSF model performed better in both long- and short-term survival prediction tasks for LUAD patients. The AUCs for 1-year, 5-year, and 10-year survival in the validation dataset were 0.791, 0.825, and 0.917, respectively.

Conclusions: This study provides a powerful network-based method for multi-omics data integration. SBMOI combined with RSF successfully predicted the long- and short-term survival of LUAD patients, with especially high accuracy for long-term survival. In addition, the SBMOI algorithm has the potential to be combined with other machine learning models for clustering or stratification tasks, and to be applied to other diseases.
