Abstract

Lung cancer is among the malignant tumors that pose the greatest threat to human health, and studies have shown that certain genes play an important regulatory role in its occurrence and development. This paper proposes a LightGBM ensemble learning method that builds a prognostic model from immune-related gene (IRG) profile data and clinical data to predict the survival of lung adenocarcinoma patients. First, the Limma package was used for differential gene expression analysis, CoxPH regression analysis was used to screen for IRGs associated with prognosis, and the XGBoost algorithm was used to score the importance of the IRG features. LASSO regression analysis was then used to select the IRGs suitable for constructing the prognostic model, yielding a total of 17 IRG features. LightGBM was trained on the selected IRGs. The K-means algorithm was used to divide the patients into three groups, and the area under the curve (AUC) of the receiver operating characteristic (ROC) for the model's survival predictions in the three groups was 96%, 98% and 96%, respectively. The experimental results show that the proposed model can divide patients with lung adenocarcinoma into three groups [5-year survival rate higher than 65% (group 1), lower than 65% but higher than 30% (group 2) and lower than 30% (group 3)] and can accurately predict the 5-year survival rate of lung adenocarcinoma patients.
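The grouping step can be illustrated with a minimal sketch. The abstract does not say which features K-means is applied to, so this sketch assumes the simplest case: clustering scalar predicted 5-year survival probabilities into three groups and numbering them so that group 1 has the best prognosis. The function names `kmeans_1d` and `survival_groups` and the input values are hypothetical and purely illustrative; they are not the paper's implementation or data.

```python
from statistics import mean


def kmeans_1d(values, k=3, iters=100):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation (an assumption of this sketch):
    # spread the k starting centroids evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids, labels


def survival_groups(probs):
    """Map each predicted survival probability to group 1 (best) .. 3 (worst)."""
    centroids, labels = kmeans_1d(probs, k=3)
    # Rank clusters by centroid, highest survival first.
    order = sorted(range(3), key=lambda j: centroids[j], reverse=True)
    rank = {cluster: g + 1 for g, cluster in enumerate(order)}
    return [rank[lab] for lab in labels]


# Made-up predicted 5-year survival probabilities, for illustration only.
predicted = [0.91, 0.78, 0.70, 0.55, 0.48, 0.40, 0.22, 0.15, 0.08]
groups = survival_groups(predicted)
print(groups)
```

With these toy inputs the three clusters fall on either side of the 65% and 30% boundaries quoted above, but in general K-means chooses its boundaries from the data rather than from fixed thresholds.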
