Abstract

Accurate prediction of non-small cell lung cancer (NSCLC) prognosis after surgery remains challenging. The Cox proportional hazards (PH) model is widely used; however, it has several limitations. In this study, we developed novel neural network models, called binned time survival analysis (DeepBTS) models, using 30 clinico-pathological features of surgically resected NSCLC patients (training cohort, n = 1,022; external validation cohort, n = 298). We employed the root-mean-square error (in the supervised learning model, s-DeepBTS) or the negative log-likelihood (in the semi-unsupervised learning model, su-DeepBTS) as the loss function. The su-DeepBTS algorithm achieved better performance (C-index = 0.7306; AUC = 0.7677) than the other models (Cox PH: C-index = 0.7048 and AUC = 0.7390; s-DeepBTS: C-index = 0.7126 and AUC = 0.7420). The top 14 features were selected using the su-DeepBTS model as a selector, and they could distinguish the low- and high-risk groups in the training cohort (p = 1.86 × 10⁻¹¹) and the validation cohort (p = 1.04 × 10⁻¹⁰). When each model was trained with its optimal feature set, the su-DeepBTS model predicted the prognoses of NSCLC patients better than the traditional model, especially in stage I patients. Follow-up studies using combined radiological, pathological imaging, and genomic data to enhance the performance of our model are ongoing.

Highlights

  • Lung cancer is the fourth most commonly diagnosed cancer and the second most common cause of cancer-related death worldwide

  • In the clinical Big Data era, neural network approaches can serve as alternatives to the Cox proportional hazards (PH) model and overcome its disadvantages

  • We developed a deep learning algorithm that uses a negative log-likelihood (NLLH) cost function to predict clinical outcomes in particular time intervals for non-small cell lung cancer (NSCLC) patients who underwent surgical resection, using clinico-pathological data that are obtainable in actual clinical practice

Introduction

Lung cancer is the fourth most commonly diagnosed cancer and the second most common cause of cancer-related death worldwide. The Cox PH model has known limitations: (1) its assumptions are difficult to satisfy with real-world data, and their violation may lead to a false model [1]; (2) the exact model formula for tied samples is not computationally efficient, so Efron’s or Breslow’s approximations are employed to fit the model in a reasonable time. These approximations are incapable of handling ties correctly and produce significantly different results depending on the frequency of ties [2]. In this paper, we present a novel neural network model that uses clinico-pathological variables to predict the recurrence probabilities of NSCLC patients in time-series intervals after surgical resection. We also propose a novel feature selection method based on the neural network model, which can be used to measure the effect of each variable on the model.
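To make the idea of a negative log-likelihood over time intervals concrete, the following is a minimal sketch of a common discrete-time survival loss. It is not necessarily the authors' exact formulation, which this excerpt does not give. The function name `binned_nll`, the array layout, and the convention that a censored patient is counted as event-free through the censoring interval are all assumptions made for illustration.

```python
import numpy as np

def binned_nll(hazards, event_bin, observed, eps=1e-12):
    """Mean negative log-likelihood for binned-time survival (illustrative).

    hazards   : (n, n_bins) predicted conditional probability of recurrence
                in each interval, given no recurrence before it
    event_bin : (n,) index of the interval holding the event or censoring time
    observed  : (n,) 1 if recurrence was observed, 0 if censored
    """
    hazards = np.clip(np.asarray(hazards, dtype=float), eps, 1.0 - eps)
    event_bin = np.asarray(event_bin)
    observed = np.asarray(observed, dtype=float)
    n, n_bins = hazards.shape

    # Mask of intervals each patient is known to have passed event-free.
    bins = np.arange(n_bins)
    survived = (bins[None, :] < event_bin[:, None]).astype(float)
    log_surv = (survived * np.log1p(-hazards)).sum(axis=1)

    # Final interval: event contributes log(h); censoring contributes
    # log(1 - h) (assumed convention: event-free through that interval).
    h_last = hazards[np.arange(n), event_bin]
    log_last = observed * np.log(h_last) + (1.0 - observed) * np.log1p(-h_last)

    return -(log_surv + log_last).mean()

# Two hypothetical patients over three intervals.
example_hazards = np.array([[0.1, 0.2, 0.3],
                            [0.1, 0.2, 0.3]])
example_loss = binned_nll(example_hazards, event_bin=[1, 2], observed=[1, 0])
```

In this toy example, the first patient recurs in the second interval and the second patient is censored in the third, so the loss averages the negative logs of 0.9 × 0.2 and 0.9 × 0.8 × 0.7. Because the likelihood is written per interval, it needs no tie approximation of the kind described above for the Cox PH model.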
