Diabetes Prediction Using Derived Features and Ensembling of Boosting Classifiers

R Rajkamal, Xiao-Zhi Gao, Anitha Karthi

doi:10.32604/cmc.2022.027142

Open Access

https://doi.org/10.32604/cmc.2022.027142

Abstract

Diabetes is increasingly common and poses a serious threat to human well-being. Machine Learning (ML) has recently drawn attention in the healthcare industry, and several ML models have been developed around different datasets for diabetes prediction. It is essential that these models predict diabetes accurately, and highly informative features are vital to a model's predictive capability. Feature engineering (FE) is a way to obtain such features. This work uses the Pima Indian Diabetes Dataset (PIDD) to examine and analyze the impact of informative features on ML models for diabetes prediction. Missing values (MV) and the effect of imputation on the data distribution of each feature are analyzed. Permutation importance and partial dependence analyses reveal that Glucose (GLUC), Body Mass Index (BMI), and Insulin (INS) are highly informative features. Derived features are obtained for BMI and INS to add information alongside their raw forms. An ensemble classifier combining AdaBoost (AB) and XGBoost (XB) is used to analyze the impact of the proposed FE approach. With derived features included, the ensemble model performs well, achieving a high Diagnostic Odds Ratio (DOR) of 117.694. This is a margin of 8.2% over the ensemble model with no derived features (DOR = 96.306). Including the derived features alongside the FE approach of the current state of the art lets the ensemble model reach Sensitivity (0.793), Specificity (0.945), DOR (79.517), and False Omission Rate (0.090), which further improves the state-of-the-art results.
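
The four evaluation measures named in the abstract all follow from the counts of a binary confusion matrix. The sketch below uses their standard definitions; the function name and the example counts are hypothetical and are not taken from the paper, so the printed values are not the paper's results.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Compute standard diagnostic measures from confusion-matrix counts.

    tp/fn/fp/tn are true positives, false negatives, false positives,
    and true negatives. DOR is undefined when fp or fn is zero.
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    dor = (tp * tn) / (fp * fn)             # diagnostic odds ratio
    false_omission_rate = fn / (fn + tn)    # share of negative calls that are wrong
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "dor": dor,
        "false_omission_rate": false_omission_rate,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(tp=40, fn=10, fp=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A higher DOR means the classifier separates positive from negative cases more strongly, while a lower False Omission Rate means fewer diabetic cases hidden among the negative predictions.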

Journal: Computers, Materials & Continua	Publication Date: Jan 1, 2022
Citations: 3	License type: cc-by

Similar Papers

Diabetes Prediction Using Ensembling of Different Machine Learning Classifiers
Md Kamrul Hasan ... Eklas Hossain
IEEE Access | VOL. 8
01 Jan 2020

Heart disease prediction using entropy based feature engineering and ensembling of machine learning classifiers
Rajkamal Rajendran ... Anitha Karthi
Expert Systems with Applications | VOL. 207
21 Jun 2022

A risk assessment and prediction framework for diabetes mellitus using machine learning algorithms
Salliah Shafi Bhat ... Venkatesan Selvam
Healthcare Analytics | VOL. 4
23 Oct 2023

Machine learning for faster estimates of groundwater response to artificial aquifer recharge
Valdrich J Fernandes ... Coen J Ritsema
Journal of Hydrology | VOL. 637
25 May 2024
