Abstract

Diabetes is a long-term disease. Poor blood sugar control in diabetic patients can lead to serious complications such as kidney and heart disease. Obesity is widely regarded as a major risk factor for type 2 diabetes. In this research, a model is proposed to predict diabetic obese patients. It uses the Expectation Maximization, PCA, and SMOTE algorithms in the preprocessing and feature extraction phases and a fuzzy KNN classifier in the prediction phase. The model was applied to a real dataset, and the accuracy of the prediction results reflects the positive effect of the preprocessing techniques. The proposed model achieves an accuracy of 95.97% and outperforms other models applied to the same dataset.

Highlights

  • Data Mining (DM) and Machine Learning (ML) techniques are a common way in data mining research to make use of large amounts of available knowledge-based data

  • Principal Component Analysis (PCA) is a statistical feature extraction approach that linearly transforms a set of observations of possibly correlated variables into a set of uncorrelated variables known as principal components

  • We report the findings obtained when the fuzzy K-Nearest Neighbor (KNN) classifier is used with the proposed model on the dataset described, and when the fuzzy KNN classifier is applied to the raw data of the dataset
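
The PCA highlight above can be illustrated with a minimal sketch. It assumes only two features, so the principal axes can be found in closed form from the 2x2 covariance matrix. The helper names `covariance` and `pca_2d` and the toy data are hypothetical and are not taken from the paper or its dataset.

```python
import math


def covariance(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def pca_2d(points):
    """Rotate centered 2-D points onto their principal axes.

    Hypothetical helper for illustration; handles exactly two features.
    Returns (scores, theta): the rotated points and the rotation angle.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    sxx = covariance(xs, xs)
    syy = covariance(ys, ys)
    sxy = covariance(xs, ys)

    # Angle of the leading eigenvector of [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    # Center each point, then express it in the rotated coordinate system.
    scores = [
        ((x - mean_x) * cos_t + (y - mean_y) * sin_t,
         -(x - mean_x) * sin_t + (y - mean_y) * cos_t)
        for x, y in points
    ]
    return scores, theta


# Made-up toy data with two strongly correlated features (not from the paper).
toy = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
scores, theta = pca_2d(toy)
pc1 = [s[0] for s in scores]
pc2 = [s[1] for s in scores]

print("covariance before:", covariance([p[0] for p in toy], [p[1] for p in toy]))
print("covariance after: ", covariance(pc1, pc2))  # close to zero
print("variance of PC1 and PC2:", covariance(pc1, pc1), covariance(pc2, pc2))
```

In this toy case, keeping only the first coordinate of each score would be the feature reduction step. How many components the proposed model retains is not stated here.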

Summary

INTRODUCTION

Data Mining (DM) and Machine Learning (ML) techniques are a common way in data mining research to make use of large amounts of available knowledge-based data. Machine learning and data mining approaches are unquestionably of great importance for clinical administration, diagnosis, and treatment. As part of this work, the recent literature on ML and DM methodologies was examined for many diseases, especially chronic diabetes. Type 2 diabetes mellitus has been postulated as a primary cause of the development of nonalcoholic fatty liver disease (NAFLD) or nonalcoholic steatohepatitis, which likely reflects the rapid progression of weight gain and insulin resistance in type 2 diabetes mellitus. Obesity and diabetes, both multifactorial and difficult illnesses, have become major public health issues across the world [6].

RELATED WORK
PROPOSED SOLUTION AND DATASET
Expectation Maximization Algorithm for Estimating the Missing Values
Feature Reduction using Principal Component Analysis Algorithm
Handling the Un-Balanced Data using SMOTE Algorithm
Fuzzy KNN Classifier
Dataset Description
Proposed Model
Evaluation Method
Results
CONCLUSION AND FUTURE WORK