Abstract

As precision medicine is gradually adopted, data mining technology is being widely used in the medical field. As living standards improve, the incidence of digestive tract tumors is rising, so early detection and diagnosis of these tumors is of great significance to patient prognosis. In this paper, we propose a data mining model built on existing digestive tract tumor data. The model is based on an improved logistic regression algorithm and is used to establish a digestive tract tumor recognition model that attempts to identify early gastrointestinal cancers. In the modeling calculation, an initial recognition model was first established using the improved logistic regression model. The initial model had a misrecognition rate of 6%, a missing ratio of 3%, a KS value of 0.606, and an AUC value of 0.863. After the initial model was determined, ten sets of test data were used to train it further, and logistic regression, decision tree, and neural network methods were applied so that the results of the three methods could be compared. The outcome of this comparison guided the next round of training. The final recognition model obtained in this way has a misrecognition rate of 5%, a missing ratio of 0%, a KS value of 0.653, and an AUC value of 0.886. These results indicate that the final model identifies early digestive tract cancers better than the initial model.
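The four evaluation measures quoted above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: the misrecognition rate is taken to be the false positive rate, the missing ratio is taken to be the false negative rate, the decision threshold is 0.5, and the KS value is the maximum gap between the true positive rate and the false positive rate over all score thresholds. The function names and the toy data are hypothetical.

```python
def error_rates(labels, scores, threshold=0.5):
    """Return (misrecognition rate, missing ratio).

    Assumption: misrecognition rate = false positive rate and
    missing ratio = false negative rate, at the given threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    return fp / n_neg, fn / n_pos


def auc_and_ks(labels, scores):
    """Return (AUC, KS) for binary labels (1 = tumor) and model scores.

    AUC is the trapezoidal area under the ROC curve; KS is the largest
    |TPR - FPR| seen while sweeping the threshold from high to low.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    auc = ks = 0.0
    prev_tpr = prev_fpr = 0.0
    i = 0
    while i < len(pairs):
        # Consume all samples that share the same score as one ROC step.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        tpr, fpr = tp / n_pos, fp / n_neg
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        ks = max(ks, abs(tpr - fpr))
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return auc, ks


# Hypothetical toy data, not taken from the paper.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(error_rates(labels, scores))
print(auc_and_ks(labels, scores))
```

Under these assumed definitions, a lower misrecognition rate and missing ratio together with higher KS and AUC values correspond to the improvement reported between the initial and final models.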
