Abstract

Background and aim: Esophageal cancer (EC) is a highly prevalent and progressive disease. Predicting EC risk early in the population is crucial for preventing the disease and improving overall health. To date, few studies have built prediction models for EC risk, and most of them relied on statistical methods. Machine learning (ML) approaches have yielded efficient predictive insights in the clinical domain. This study therefore aims to develop an ML-based risk prediction model for EC from its risk factors, in order to stratify people at high risk of EC and support prevention at the community level.

Material and methods: This retrospective study was conducted from 2018 to 2022 in Sari City on 3256 EC and non-EC cases. Six algorithms, namely Random Forest (RF), eXtreme Gradient Boosting (XG-Boost), Bagging, K-Nearest Neighbor (K-NN), Support Vector Machine (SVM), and Artificial Neural Networks (ANNs), were used to develop the risk prediction model for EC.

Results: Comparing the performance of the algorithms showed that the XG-Boost model predicted EC risk best, with AU-ROC = 0.92 and AU-ROC-test = 0.889 for the internal and validation states, respectively. According to the XG-Boost model, the five top predictors of EC risk were sex, drinking hot liquids, fruit consumption, achalasia, and vegetable consumption.

Conclusion: This study showed that XG-Boost can support early prediction of EC risk, helping individuals and clinical providers identify the high-risk group and take preventive measures, both by modifying the risk factors associated with EC and through other clinical solutions.
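As a reading aid for the AU-ROC values reported in the Results, the metric can be understood as the probability that a randomly chosen EC case receives a higher predicted risk than a randomly chosen non-EC case. The following is a minimal sketch of that pairwise definition, not the study's evaluation code; the function name `auc_roc` and the labels and scores are hypothetical illustrations, not study data.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AU-ROC via its pairwise definition: the fraction of
    (positive, negative) pairs in which the positive case has the
    higher score, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = EC, 0 = non-EC) and predicted risks.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.7]
example_auc = auc_roc(labels, scores)
print(f"AU-ROC = {example_auc:.4f}")
```

The quadratic pairwise loop is chosen for clarity; for a cohort of several thousand cases, a rank-based computation gives the same value more efficiently.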
