Abstract

This study applies data mining (DM) based approaches to build a predictive model for diabetes mellitus from a dataset collected in the seven northwestern states of Nigeria. Data were collected from both primary and secondary sources, through questionnaires and verbal interviews with patients with diabetes mellitus and other chronic diseases. Hospital records of the patients involved in this work were also used. The dataset comprises 281 instances with 8 attributes. R programming software (version 5.3.1) was used in the experiments. The DM techniques used in this research were binomial logistic regression, classification, the confusion matrix and the correlation coefficient. The data were partitioned into training and testing sets: the training data were used to build the model, while the testing data were used to validate it. The fitting algorithm for the best-fitted model converged with a null deviance of 281.951, a residual deviance of 16.476 and an AIC of 30.476. The significant variables are AGE, GLU, DBP and KDYP, with P values of 0.025, 0.01, 0.05 and 0.025, respectively. The fitted model achieved an accuracy of approximately 97.1%. The correlation analysis revealed that diabetic patients are more likely to be hypertensive than patients with the other chronic diseases considered in the research.
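As a consistency check on the deviance and AIC figures reported above, the following worked relation may help. It assumes the model was fitted to ungrouped 0/1 responses, in which case the saturated model has zero log-likelihood and the residual deviance equals minus twice the maximized log-likelihood. The symbols D_res (residual deviance), L-hat (maximized likelihood) and k (number of estimated coefficients) are notation introduced here and do not appear in the source.

```latex
\begin{align*}
\mathrm{AIC} &= -2\ln\hat{L} + 2k = D_{\mathrm{res}} + 2k \\
30.476 &= 16.476 + 2k \quad\Longrightarrow\quad k = 7
\end{align*}
```

Under that assumption, the reported values imply seven estimated coefficients, intercept included. The text here does not list the terms of the final model, so this is an inference from the reported numbers rather than a reported result.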

Highlights

  • The recent developments in biotechnology and health sciences have led to a significant production of data, such as clinical information, generated from large Electronic Health Records

  • The attributes were abbreviated as follows: diabetes mellitus patient type (TYPE), patient’s age (AGE), patient’s glucose level (GLU), patient’s diastolic blood pressure (DBP), patient’s body mass index (BMI), symptoms related to kidney problems (KDYP), symptoms related to heart/cardiovascular problems (HETP) and symptoms related to eye problems (EYEP)

  • A confusion matrix summarizes correct and incorrect classifications; for a binary classifier it is a 2 × 2 square matrix consisting of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN)
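The confusion-matrix highlight above can be illustrated with a minimal sketch. The study itself used R; Python is used here only for illustration. The function names `confusion_counts` and `accuracy`, the 0/1 label coding, and the sample labels are all hypothetical and are not taken from the study's data or code.

```python
from typing import Sequence, Tuple


def confusion_counts(actual: Sequence[int],
                     predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for binary labels coded 1 (positive) and 0 (negative)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a not in (0, 1) or p not in (0, 1):
            raise ValueError("labels must be coded as 0 or 1")
        if a == 1 and p == 1:
            tp += 1          # positive case predicted positive
        elif a == 0 and p == 0:
            tn += 1          # negative case predicted negative
        elif a == 0 and p == 1:
            fp += 1          # negative case predicted positive
        else:
            fn += 1          # positive case predicted negative
    return tp, tn, fp, fn


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Classification accuracy: share of all cases that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Hypothetical labels for illustration only (NOT the study's data).
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(f"TP={tp} TN={tn} FP={fp} FN={fn} accuracy={accuracy(tp, tn, fp, fn):.3f}")
```

Accuracy computed this way is the quantity the abstract reports as approximately 97.1% for the test set; the toy labels here are unrelated to that figure.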


Summary

Introduction

Recent developments in biotechnology and health sciences have led to a significant production of data, such as clinical information generated from large electronic health records. In 2008, a benchmark diabetes study [14] was conducted across selected health centres in Nigeria to assess clinical and laboratory profiles and evaluate the quality of care of Nigerian diabetics, with a view to planning and improving diabetes care. Another related study was carried out in northwestern Nigeria to assess diabetic patients’ compliance with disease management, including the socio-demographic factors influencing that compliance [15]. The present research aims to build predictive models of diabetes mellitus in relation to other chronic diseases using DM-based approaches, with the northwestern part of Nigeria as the case study.

Material and methods
Proposed analytical platforms for the model
Sampling techniques used in the process of data collection
Data collection
Attribute information
Variable selections
Binomial logistic regression model
Correlation coefficient
Classification accuracy
Confusion matrix
R programming software
Results
Correlation analysis
Conclusion
Ethical approval
10 References
