Data Imputation-Based Learning Models for Prediction of Diabetes

Dilip Singh Sisodia,Reenu Agrawal

doi:10.1109/dasa51403.2020.9317070

Abstract

The accuracy of automated diabetes prediction models built from a patient's past health records depends heavily on the correctness of the data used. If patient data is inconsistent and contains many missing values, prediction becomes more challenging. In this paper, the impact of missing value imputation (MVI) techniques on diabetes prediction is evaluated on data with existing missing values. The experiments are performed on the Pima Indians diabetes dataset, which contains many missing values. First, MVI techniques are used to handle the missing values. Second, K-Means clustering is used to identify the best imputation technique based on the percentage of incorrectly classified instances in each imputed dataset. Third, principal component analysis (PCA) is used for feature extraction, and Info Gain is used to select the optimal set of features. Six classification models are used in the experiments: multi-layer perceptron (MLP), support vector machine (SVM), Naive Bayes (NB), decision tree (J48), AdaBoost, and Bagging. Eight techniques are used for missing value imputation: CMC, Case Deletion, KMI, SVMI, WKNNI, KNNI, FKMI, and MC. The experimental results show that the Case Deletion and KMI imputed datasets have the lowest number of incorrectly classified instances. When the six classifiers are applied to these two datasets, the MLP classifier attains the highest accuracy: 98.9967% with the Case Deletion imputed dataset and 99.2767% with the KMI imputed dataset, in both cases when six principal components are used. The other classifiers in the comparison obtain accuracies between 93% and 98%.
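
To make the idea of nearest-neighbour imputation (KNNI in the list above) concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `knn_impute`, the use of `None` to mark a missing value, the distance computed over shared observed features, and the toy data are all hypothetical choices made for this example.

```python
import math


def knn_impute(rows, k=3):
    """Fill missing entries (None) with the mean of the k nearest rows.

    Assumption for this sketch: distance is the root-mean-square difference
    over the features that both rows have observed.
    """
    imputed = [list(row) for row in rows]  # copy so the input is not mutated
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A donor row must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:  # leave the gap if no donor row exists
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy records (hypothetical values, not taken from the Pima dataset).
records = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, None],   # third feature is missing
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 100.0],
]
filled = knn_impute(records, k=2)
print(filled[1])
```

With `k=2`, the missing entry is filled from the two closest records, so the distant fourth record does not influence it. A weighted variant such as WKNNI would presumably weight each neighbour by its distance instead of taking a plain mean.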

Similar Papers

Hybrid prediction model with missing value imputation for medical data
Archana Purwar ... Sandeep Kumar Singh
Expert Systems with Applications | VOL. 42
04 Mar 2015

An Integrated Novel Framework for Coping Missing Values Imputation and Classification
Monalisa Jena ... Satchidananda Dehuri
IEEE Access | VOL. 10
01 Jan 2021

A Perspective of Missing Value Imputation Approaches
Wajeeha Rashid ... Manoj Kumar Gupta
19 Jun 2020

Business Intelligence Techniques for Missing Data Imputations
02 Nov 2015
