Abstract
Background: Preeclampsia, which affects 2–4% of pregnancies worldwide, poses a substantial risk to maternal health. Late-onset preeclampsia in particular accounts for a large share of preeclampsia cases. However, existing prediction models are limited in their early detection capability and often rely on costly, less accessible indicators, which makes them less applicable in resource-limited settings.

Objective: To develop and evaluate prediction models for late-onset preeclampsia using general information, maternal risk factors, and laboratory indicators from early gestation (6–13 weeks).

Methods: A dataset of 2000 pregnancies, including 110 late-onset preeclampsia cases, was analyzed. General information and maternal risk factors were collected from the hospital information system, and relevant laboratory indicators measured between 6 and 13 weeks of gestation were examined. Logistic regression served as the baseline model against which the predictive performance of the support vector machine and extreme gradient boosting models for late-onset preeclampsia was assessed.

Results: Using only general information and risk factors, the logistic regression model identified 19.1% of cases, with a false positive rate of 0.4%. With 15 selected factors spanning general information, risk factors, and laboratory indicators, its detection rate improved to 27.3% and its false positive rate increased to 0.7%. Using only general information and risk factors, the support vector machine model achieved a detection rate of 27.3%, with a false positive rate of 0.0%. After all the laboratory indicators were included, its detection rate improved significantly to 54.5%, but its false positive rate increased to 7.7%. Using only general information and risk factors, the extreme gradient boosting model achieved a detection rate of 31.6%, with a false positive rate of 1.5%. After all the laboratory indicators were included, its detection rate increased to 52.6%, with a false positive rate of 0.7%. With the laboratory indicators added, the areas under the ROC curve for the logistic regression, support vector machine, and extreme gradient boosting models were 0.877, 0.839, and 0.842, respectively.

Conclusion: Compared with the logistic regression model, both the support vector machine and extreme gradient boosting models significantly improved the detection rate for late-onset preeclampsia. However, the support vector machine model had a comparatively higher false positive rate. Notably, the logistic regression and extreme gradient boosting models both exhibited a high negative predictive value of 99.3%, underscoring their effectiveness in identifying pregnant women who are unlikely to develop late-onset preeclampsia. Additionally, logistic regression showed the highest area under the ROC curve (0.877), suggesting that the traditional model retains distinct advantages for prediction.
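
The three headline metrics reported above (detection rate, false positive rate, and negative predictive value) all derive from the counts in a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The function names are hypothetical, and the counts are invented for illustration only; they are not taken from the study.

```python
def detection_rate(tp: int, fn: int) -> float:
    """Sensitivity: share of true cases that the model flags."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of unaffected pregnancies that the model flags."""
    return fp / (fp + tn)


def negative_predictive_value(tn: int, fn: int) -> float:
    """Share of negative predictions that are truly unaffected."""
    return tn / (tn + fn)


# Illustrative counts only (assumption): NOT data from the study.
tp, fn = 40, 60      # affected pregnancies: flagged / missed
fp, tn = 18, 1782    # unaffected pregnancies: flagged / correctly cleared

print(f"Detection rate:            {detection_rate(tp, fn):.3f}")
print(f"False positive rate:       {false_positive_rate(fp, tn):.3f}")
print(f"Negative predictive value: {negative_predictive_value(tn, fn):.3f}")
```

With these invented counts, the negative predictive value is high even though most cases are missed. When a condition is rare, true negatives dominate the negative predictions, so a high negative predictive value is best read alongside the detection rate.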