Machine Learning Modelling for Imbalanced Dataset: Case Study of Adolescent Obesity in Malaysia

Nur Liana Ab Majid, Syahid Anuar

doi:10.37934/araset.36.1.189202

Open Access

https://doi.org/10.37934/araset.36.1.189202

Abstract

Obesity among adolescents is a public health issue with an increasing burden of disease. Predicting from imbalanced health data with machine learning (ML) may introduce bias and diminish model performance. Misclassification in healthcare data could lead to misdiagnosing a patient or failing to detect a health issue that is present. The purpose of this study is to predict adolescent obesity using machine learning while applying multiple approaches to the imbalanced dataset. This study used a secondary dataset from the National Health and Morbidity Survey 2017. Respondents aged 13 to 17 years were selected for the classification. SPSS V26 was used for data pre-processing, data cleaning, and data analysis, while Python was used for prediction and evaluation of the models. Approaches to the imbalanced dataset, including resampling methods (Random Oversampling, Random Under-sampling) and hybrid methods (SMOTE and ADASYN), were implemented. This dataset was used to build predictive models with ML algorithms including Artificial Neural Network, Decision Tree, K-Nearest Neighbour, Logistic Regression, Naïve Bayes, Random Forest and Support Vector Machine. The performance of each model was evaluated and compared using accuracy, precision, recall, F-score and Area Under the Curve (AUC). Random Oversampling combined with the Decision Tree algorithm performed best, with accuracy (91.35%), precision (0.93), recall (0.91), F-score (0.91) and AUC (0.91) for the prediction of obesity among adolescents in Malaysia. The presented ML model development workflow, along with the techniques for handling imbalanced data, can be adapted to other health survey-based studies and may be valuable for developing other clinical prediction models.
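The two resampling methods named in the abstract, and the reason accuracy alone can mislead on imbalanced data, can be illustrated with a minimal Python sketch. This is not the authors' code: the helper names (`random_oversample`, `random_undersample`, `precision_recall_f1`), the toy 90/10 class split, and the always-majority "classifier" are assumptions made up for illustration, and only the standard library is used rather than whatever packages the study relied on.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Collect feature rows under their class label."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Randomly drop rows until every class matches the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        X_out.extend(rng.sample(rows, target))
        y_out.extend([label] * target)
    return X_out, y_out


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F-score for the positive class (0.0 when undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): 90 majority-class rows (0) and 10 minority-class rows (1).
X = [[i] for i in range(100)]
y = [0] * 90 + [1] * 10

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
over_counts = Counter(y_over)
under_counts = Counter(y_under)

# A degenerate model that always predicts the majority class looks accurate
# on the imbalanced labels but never detects the minority class.
y_pred_majority = [0] * len(y)
accuracy = sum(t == p for t, p in zip(y, y_pred_majority)) / len(y)
precision, recall, f1 = precision_recall_f1(y, y_pred_majority)

print("oversampled:", dict(over_counts))
print("undersampled:", dict(under_counts))
print("accuracy:", accuracy, "recall:", recall)
```

In practice, resampling of this kind is usually applied only to the training split so that evaluation reflects the original class distribution; whether the study did so is not stated in the abstract.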

Journal: Journal of Advanced Research in Applied Sciences and Engineering Technology
Publication Date: Dec 24, 2023
License type: CC BY-NC

Similar Papers

Machine Learning Methods Based on CT Features Differentiate G1/G2 From G3 Pancreatic Neuroendocrine Tumors
Hai-Yan Chen ... Guo-Liang Shao
Academic Radiology | VOL. 31
04 Dec 2023

Machine learning algorithms for predicting mortality after coronary artery bypass grafting.
Amirmohammad Khalaji ... Seyed Hossein Ahmadi Tafti
Frontiers in Cardiovascular Medicine | VOL. 9
24 Aug 2022

Risk factors for high CAD-RADS scoring in CAD patients revealed by machine learning methods: a retrospective study.
Yueli Dai ... Hong Zhou
PeerJ | VOL. 11
03 Aug 2023

Development of a Prediction Model and Corresponding Scoring Table for Postherpetic Neuralgia Using Six Machine Learning Algorithms: A Retrospective Study.
Zheng Lin ... Ping Lin
Pain and Therapy | VOL. 13
04 Jun 2024
