Abstract

Compared with other cancer types, gastric cancer is predominantly driven by demographic and dietary factors. The aim of this study is to predict risk factors for Early Gastric Cancer (EGC) from the diet and lifestyle characteristics of people of Mizo ethnicity using supervised machine learning algorithms. For this study, 80 cases and 160 controls were selected, and a dataset of 11 features that are core risk factors for gastric cancer was chosen for data mining. The learning curves show that Naive Bayes, logistic regression, and multilayer perceptron are the best-fitting classification algorithms for our dataset. Models were constructed and evaluated using Brier score, accuracy, precision-recall curves for cases (patients) and controls (healthy individuals), and false positives. Naive Bayes gives the best classification results, with an accuracy of 90%, the lowest Brier score (0.1), and a false positive rate of 3%, compared with the other models. The logistic regression classifier performs comparably well overall but falls behind on Brier score and false positives. Using the Naive Bayes algorithm, this study identified extra salt, tuibur, smoking, and alcohol as the non-invasive etiological factors for gastric cancer in the Mizoram population. This knowledge will help in initiating early screening and in educating the public about dietary and lifestyle risk factors in a high-risk population with unique habits.
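
As a rough illustration of the evaluation metrics named above (Brier score, accuracy, and false positive rate), the following minimal Python sketch computes each one from scratch. The labels and predicted probabilities are invented toy values, not data from the study, and the 0.5 decision threshold and function names are assumptions made only for this example.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of individuals whose predicted class matches the true class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def false_positive_rate(y_true, y_pred):
    """Fraction of controls (label 0) wrongly predicted as cases (label 1)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Toy example (hypothetical values): 1 = case, 0 = control.
y_true = [1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.1, 0.3]          # predicted probability of being a case
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 decision threshold

print("Brier score:", brier_score(y_true, y_prob))
print("Accuracy:", accuracy(y_true, y_pred))
print("False positive rate:", false_positive_rate(y_true, y_pred))
```

A lower Brier score indicates better-calibrated probabilities, while the false positive rate is computed over controls only, which is why the two metrics can rank classifiers differently from accuracy alone.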
