Prediction of gestational diabetes mellitus at the first trimester: machine-learning algorithms.

Yi-Xin Li, Mei Wang, Yi-Chen Liu, Yu-Li Huang

doi:10.1007/s00404-023-07131-4

Abstract

The short- and long-term complications of gestational diabetes mellitus (GDM) for pregnancies and offspring warrant an effective individualized risk prediction model to help reduce and prevent GDM and its associated comorbidities. This study aimed to predict GDM by applying machine learning (ML) algorithms to data gathered during the first trimester. Two independent cohorts, each with 45 features collected during the first trimester, were included. We constructed prediction models based on three different ML algorithms and traditional logistic regression, and used two additional ensemble algorithms to identify the importance of individual features. The Xinhua Hospital Chongming branch (XHCM) and Shanghai Pudong New Area People's Hospital (SPNPH) cohorts included 4799 and 2795 pregnancies, respectively. In the XHCM cohort, extreme gradient boosting (XGBoost) predicted GDM with moderate performance at pregnancy initiation (area under the receiver operating characteristic curve (AUC) = 0.75) and good-to-excellent performance at the end of the first trimester (AUC = 0.99). When applied to the SPNPH cohort, the trained XGBoost model showed moderate performance (AUC = 0.83). The top predictive features for GDM diagnosis were pre-pregnancy body mass index (BMI) and maternal abdominal circumference at pregnancy initiation, and fasting plasma glucose (FPG) and HbA1c at the end of the first trimester. Our work demonstrated that ML models based on data gathered throughout the first trimester achieved moderate performance in the external validation cohort.
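
To make the reported AUC values easier to interpret, the following is a minimal sketch of how an AUC can be computed from predicted risk scores. It uses the pairwise (Mann-Whitney) formulation: the AUC is the fraction of case/non-case pairs in which the case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not the study's data or the authors' code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = GDM in this illustration).
    scores: iterable of predicted risks, same length as labels.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from either cohort.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases. The pairwise loop is quadratic in cohort size, so for cohorts of several thousand pregnancies a rank-based implementation from a statistics library would be the more practical choice.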

Full Text