Abstract

Gestational diabetes mellitus (GDM) is a type of diabetes that usually resolves at the end of pregnancy but exposes women to a higher risk of developing type 2 diabetes mellitus (T2DM). This study aimed to identify, among the factors that quantify specific metabolic processes, those that determine progression to T2DM, using machine-learning techniques. Women who progressed to T2DM (labeled as PROG, n = 19) were classified against those who did not (labeled as NON-PROG, n = 59) using Orange software, through a data analysis procedure applied to a generated data set including anthropometric data and a total of 34 features, extracted through mathematical modeling/methods procedures. Feature selection was performed with a decision tree algorithm; Naïve Bayes and penalized (L2) logistic regression were then used to evaluate the ability of the selected features to solve the classification problem. Performance was evaluated in terms of area under the receiver operating characteristic curve (AUC), classification accuracy (CA), precision, sensitivity, specificity, and F1. Feature selection yielded six features; based on them, classification performance was as follows: AUC of 0.795, 0.831, and 0.884; CA of 0.827, 0.813, and 0.840; precision of 0.830, 0.854, and 0.834; sensitivity of 0.827, 0.813, and 0.840; specificity of 0.700, 0.821, and 0.662; and F1 of 0.828, 0.824, and 0.836 for the tree algorithm, Naïve Bayes, and penalized logistic regression, respectively. Fasting glucose, age, and body mass index, together with features describing insulin action and secretion, may predict the development of T2DM in women with a history of GDM.
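For readers who want to see how the reported metrics relate to one another, the following is a minimal sketch of class-weighted CA, precision, sensitivity, specificity, and F1 computed from a two-class confusion matrix. It assumes that the metrics are averaged over both classes, weighted by class size; the abstract does not state this, but it would be consistent with CA and sensitivity being identical for every model. The function name `weighted_metrics` and the confusion-matrix counts are hypothetical and do not reproduce the study's results; only the class sizes (58 NON-PROG and 17 PROG after outlier removal) come from the text.

```python
from typing import Dict, List


def weighted_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Class-weighted metrics from a square confusion matrix.

    confusion[i][j] is the number of cases whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    out = {
        "CA": correct / total,
        "precision": 0.0,
        "sensitivity": 0.0,
        "specificity": 0.0,
        "F1": 0.0,
    }
    for c in range(n_classes):
        # Treat class c as "positive" and every other class as "negative".
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp
        tn = total - tp - fn - fp

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)

        # Weight each per-class score by the share of cases in that class.
        weight = (tp + fn) / total
        out["precision"] += weight * precision
        out["sensitivity"] += weight * recall
        out["specificity"] += weight * specificity
        out["F1"] += weight * f1
    return out


# Hypothetical counts, rows and columns ordered [NON-PROG, PROG].
# Row sums match the class sizes after outlier removal (58 and 17).
example = [[50, 8],
           [7, 10]]

for name, value in weighted_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

Under class-size weighting, the averaged sensitivity always equals the classification accuracy, because the class sizes cancel out and leave the overall fraction of correctly classified cases.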

Highlights

  • Diabetes is a chronic metabolic disease characterized by the presence of high levels of glucose in the blood

  • According to the most recent definition, gestational diabetes mellitus (GDM) is diabetes diagnosed in the second or third trimester of pregnancy that was not clearly overt diabetes prior to gestation (American Diabetes Association, 2020); although it usually resolves at the end of the pregnancy, women who experienced GDM are known to have a higher risk of developing type 2 diabetes mellitus (T2DM) later in life (American Diabetes Association, 2020)

  • From the generated data set given as input to the data analysis procedure, three outliers were removed (1 NON-PROG and 2 PROG), leaving a total of 75 cases; comparing NON-PROG and PROG, 25 of the 34 characteristics were found to be statistically different

Summary

Introduction

Diabetes is a chronic metabolic disease characterized by the presence of high levels of glucose in the blood (i.e., hyperglycemia). Several pathogenic processes can underlie the development of diabetes, leading to the identification of different diabetes categories, namely, type 1 diabetes mellitus (T1DM), type 2 diabetes mellitus (T2DM), and gestational diabetes mellitus (GDM; American Diabetes Association, 2020). Machine-learning techniques have been applied in this field to a wide variety of data and for different purposes, for example, early diagnosis (Perveen et al., 2016; Zheng and Zhang, 2017; El_Jerjawi and Abu-Naser, 2018; Sarwar et al., 2018; Zou et al., 2018; Bernardini et al., 2020; Garcia-Carretero et al., 2021), estimation of T2DM risk (Dalakleidi et al., 2017; Talaei-Khoei and Wilson, 2018; Garcia-Carretero et al., 2020), detection of subjects in the general population affected by T2DM or prediabetes (Yu et al., 2010), T2DM characterization and classification (Maniruzzaman et al., 2017; Bernardini et al., 2019), and T2DM care (Huang et al., 2007).
