Abstract

© 2019 Institute of Electrical and Electronics Engineers Inc. All rights reserved. Diabetes mellitus, a widely known chronic disease, is often called a silent killer. It makes the body produce less insulin and raises blood sugar, which leads to many complications and impairs the normal functioning of organs such as the eyes, kidneys, and nerves. Although diabetes has attracted considerable research attention, missing values and class imbalance in the data keep the overall performance of machine-learning-based diabetes classification relatively low. In this paper, we propose an effective Prediction algorithm for Diabetes Mellitus classification on Imbalanced data with Missing values (DMP_MI). First, the missing values are filled in using the Naive Bayes (NB) method for data normalization. Then, an adaptive synthetic sampling method (ADASYN) is adopted to reduce the influence of class imbalance on prediction performance. Finally, a random forest (RF) classifier is used to generate predictions, which are evaluated using a comprehensive set of evaluation indicators. Experiments performed on the Pima Indians diabetes dataset from the University of California, Irvine (UCI) repository demonstrate the effectiveness and superiority of the proposed DMP_MI.
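
To make the oversampling step more concrete, the following is a minimal sketch of ADASYN-style sampling in plain Python. It is not the authors' implementation: the function name `adasyn_sketch`, the parameters `k` and `beta`, their default values, and the toy data are all assumptions chosen for illustration, since the abstract does not state the settings used. The idea it shows is that minority samples surrounded by more majority-class neighbours receive more synthetic points.

```python
import math
import random


def adasyn_sketch(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples in the style of ADASYN.

    Hypothetical helper for illustration; k, beta and seed are assumed values.
    """
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_min = len(minority_pts)
    n_maj = len(X) - n_min
    total_new = int((n_maj - n_min) * beta)  # synthetic samples wanted overall
    if total_new <= 0 or n_min < 2:
        return []

    def nearest(x, pool):
        # k closest (point, label) pairs in pool, skipping x itself
        others = [p for p in pool if p[0] is not x]
        return sorted(others, key=lambda p: math.dist(x, p[0]))[:k]

    labelled = list(zip(X, y))
    # For each minority point: share of majority points among its neighbours
    ratios = []
    for x in minority_pts:
        nbrs = nearest(x, labelled)
        ratios.append(sum(1 for _, lab in nbrs if lab != minority) / len(nbrs))
    total = sum(ratios)
    # Fall back to uniform weights when no minority point has majority neighbours
    weights = [r / total if total else 1 / n_min for r in ratios]

    minority_pool = [(x, minority) for x in minority_pts]
    synthetic = []
    for x, w in zip(minority_pts, weights):
        nbrs = nearest(x, minority_pool)
        # Rounding per point means the final count may differ slightly from total_new
        for _ in range(round(w * total_new)):
            z = rng.choice(nbrs)[0]
            lam = rng.random()
            # Interpolate between x and a randomly chosen minority neighbour
            synthetic.append([a + lam * (b - a) for a, b in zip(x, z)])
    return synthetic


# Toy data (made up): 8 majority points (label 0) and 3 minority points (label 1)
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [2, 2], [2, 3], [3, 2],
     [3, 3], [4, 4], [3, 4]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
synthetic = adasyn_sketch(X, y, minority=1, k=2)
print(len(synthetic), "synthetic minority samples")
```

In this toy example only the minority point nearest the class boundary has majority-class neighbours, so all synthetic samples are generated around it. In a full pipeline such samples would be added to the training set only, after missing values are filled in and before the classifier is fitted.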

Highlights

  • The pancreas is one of the most important organs of the human body; the insulin it produces affects the metabolism of sugar, fat, and protein that supplies energy for daily life

  • This paper focuses on how to achieve good performance for diabetes classification

  • To the best of our knowledge, Karegowda et al. [20] achieved the best accuracy, 84.7%, using a hybrid model that integrates a genetic algorithm (GA) and a back propagation network (BPN)

Summary

INTRODUCTION

The pancreas is one of the most important organs of the human body; the insulin it produces affects the metabolism of sugar, fat, and protein that supplies energy for daily life. As the health care industry develops, it generates a mass of useful data, such as patient information, electronic medical records, and diagnosis and treatment data, which can serve as a key resource for knowledge extraction that supports decision making and cost reduction.

Q. Wang et al.: DMP_MI: Effective Diabetes Mellitus Classification Algorithm

[...] make predictions [8]. The ensemble approach, which combines machine learning algorithms, has been proposed to increase the performance and accuracy of diabetes analysis and prediction [10]. This paper focuses on how to achieve good performance for diabetes classification. The DMP_MI algorithm described here achieves 87.10% classification accuracy on a real diabetes dataset, outperforming many other algorithms.
