Prediction of large-for-gestational age infants in relation to hyperglycemia in pregnancy – A comparison of statistical models

Kristen S. Gibbons, Allan M.Z. Chang, Ronald C.W. Ma, Wing Hung Tam, Patrick M. Catalano, David A. Sacks, Julia Lowe, H. David McIntyre

doi:10.1016/j.diabres.2021.108975

Abstract

Aims: Using data from a large multi-centre cohort, we aimed to create a risk prediction model for large-for-gestational age (LGA) infants using both logistic regression and naïve Bayes approaches, and to compare the utility of the two approaches.

Methods: We compared two techniques underpinning machine learning, logistic regression (LR) and naïve Bayes (NB), in terms of their ability to predict LGA infants. Using data from five centres involved in the Hyperglycemia and Adverse Pregnancy Outcome (HAPO) study, we developed LR and NB models and compared their predictive ability and stability. The models combined hyperglycaemia (assessed in three forms: IADPSG GDM yes/no, GDM subtype, and OGTT z-score quintiles) with demographic and clinical variables as potential predictors.

Results: The two approaches produced similar estimates of LGA risk (intraclass correlation coefficient 0.955; 95% CI 0.952 to 0.958); however, the AUROC for the LR model was significantly higher (0.698 vs 0.682; p < 0.001). Among the three LR models, the one using individual OGTT z-score quintiles had a statistically significantly higher AUROC than the other two.

Conclusions: Logistic regression can be used with confidence to assess the relationship between clinical and biochemical variables and outcome.
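The AUROC used to compare the two models can be read as the probability that a randomly chosen LGA infant receives a higher predicted risk than a randomly chosen non-LGA infant. The following is a minimal sketch of that calculation, not the authors' analysis code: the function name `auroc`, the variable names, and all of the numbers are hypothetical and chosen only for illustration; they are not HAPO data.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts, over every (positive, negative) pair, how often the positive
    case has the higher score; ties contribute one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = LGA, 0 = not LGA.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
# Hypothetical predicted risks from two models for the same eight infants.
risk_lr = [0.05, 0.10, 0.30, 0.20, 0.40, 0.15, 0.08, 0.50]
risk_nb = [0.06, 0.12, 0.25, 0.28, 0.35, 0.11, 0.09, 0.45]

print("AUROC (first model): ", auroc(risk_lr, labels))
print("AUROC (second model):", auroc(risk_nb, labels))
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based formula would be preferable on a cohort-sized data set. A formal comparison of two correlated AUROCs, such as the p-value reported in the Results, requires a dedicated statistical test that this sketch does not attempt.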

Full Text