Abstract

Smoking remains a persistent global public health challenge, posing severe health risks and socioeconomic burdens. Early detection of smokers is crucial for implementing targeted interventions, public health campaigns, and personalized support to mitigate smoking-related illnesses. The dataset used in this research comprises extensive information on individual health, encompassing age, height, weight, waist circumference, visual acuity in both eyes, hearing capability in both ears, systolic and diastolic blood pressure, cholesterol levels (HDL, LDL, and triglycerides), haemoglobin levels, urine protein content, liver enzymes (AST, ALT, and GTP), presence of dental caries, and fasting blood sugar. Four machine learning methods are applied to the smoker identification task: logistic regression, Gaussian Naive Bayes, the Random Forest Classifier, and the XGBoost Classifier. Each algorithm is trained on the training set and then evaluated using several performance metrics, including accuracy, ROC-AUC scores, and confusion matrices. The results show that the Random Forest Classifier performs best, reaching a precision of around 79.4% in correctly identifying smokers. This model outperforms the other algorithms and proves to be a robust approach for smoker detection based on the provided health parameters. Furthermore, the study examines the interpretability of the models, analyzing the significance of different health parameters in predicting smoking behaviour. Insights gained from the feature importance analysis offer valuable guidance for public health practitioners and policymakers in designing targeted interventions.
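
The evaluation metrics named in the abstract (accuracy, precision, ROC-AUC, and the confusion matrix) can be illustrated with a minimal sketch. This is not the authors' code: the labels, scores, and the 0.5 decision threshold below are made-up toy values, and the function names are hypothetical. The sketch assumes binary labels where 1 means smoker and 0 means non-smoker.

```python
from typing import List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels, with 1 = smoker."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of all individuals classified correctly."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def precision(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predicted smokers who really are smokers."""
    _, fp, _, tp = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """ROC-AUC as the probability that a random smoker receives a higher
    score than a random non-smoker (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only, not from the study's dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("confusion (tn, fp, fn, tp):", confusion_counts(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
print("ROC-AUC:", roc_auc(y_true, scores))
```

Accuracy and precision depend on the chosen threshold, whereas ROC-AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason to report the metrics together.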
