Preeclampsia is a potentially life-threatening pregnancy complication. Among women whose pregnancies are complicated by preeclampsia, the Preeclampsia Integrated Estimate of RiSk (PIERS) models (i.e., the PIERS Machine Learning [PIERS-ML] model and the logistic regression-based fullPIERS model) accurately identify individuals at greatest or least risk of adverse maternal outcomes within 48 h following admission. Both models were developed and validated for use as part of initial assessment. In the United Kingdom, the National Institute for Health and Care Excellence (NICE) recommends repeated use of such static models for ongoing assessment beyond the first 48 h. This study evaluated the models' performance during such consecutive prediction. This multicountry prospective study used data from 8,843 women (32% white, 30% black, and 26% Asian) with a median age of 31 years. These women, admitted to maternity units in the Americas, sub-Saharan Africa, South Asia, Europe, and Oceania, were diagnosed with preeclampsia at a median gestational age of 35.79 weeks between 2003 and 2016. The risk differentiation performance of the PIERS-ML and fullPIERS models was assessed for each day within a 2-week post-admission window. The PIERS adverse maternal outcome includes one or more of: death, end-organ complication (cardiorespiratory, renal, hepatic, etc.), or uteroplacental dysfunction (e.g., placental abruption). The main outcome measures were: trajectories of mean risk in the uncomplicated course and adverse outcome groups; daily area under the precision-recall curve (AUC-PRC); potential clinical impact (i.e., net benefit in decision curve analysis); dynamic shifts of multiple risk groups; and daily likelihood ratios. In the 2-week window, the number of daily outcome events decreased from over 200 to around 10. For both PIERS-ML and fullPIERS models, we observed consistently higher mean risk in the adverse outcome (versus uncomplicated course) group.
The AUC-PRC values (0.2-0.4) of the fullPIERS model remained low (i.e., close to the daily fraction of adverse outcomes, indicating low discriminative capacity). The PIERS-ML model's AUC-PRC peaked on day 0 (0.65), and notably decreased thereafter. When categorizing women into multiple risk groups, the PIERS-ML model generally showed good rule-in capacity for the "very high" risk group, with positive likelihood ratio values ranging from 70.99 to infinity, and good rule-out capacity for the "very low" risk group where most negative likelihood ratio values were 0. However, performance declined notably for other risk groups beyond 48 h. Decision curve analysis revealed a diminishing advantage for treatment guided by both models over time. The main limitation of this study is that the baseline performance of the PIERS-ML model was assessed on its development data; however, its baseline performance has also undergone external evaluation. In this study, we have evaluated the performance of the fullPIERS and PIERS-ML models for consecutive prediction. We observed deteriorating performance of both models over time. We recommend using the models for consecutive prediction with greater caution and interpreting predictions with increasing uncertainty as the pregnancy progresses. For clinical practice, models should be adapted to retain accuracy when deployed serially. The performance of future models can be compared with the results of this study to quantify their added value.
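To make the likelihood ratio and net benefit measures concrete, the following is a minimal sketch of how such quantities are commonly computed. It is not the study's analysis code. The function names and all counts are hypothetical, and the formulas are the standard textbook definitions of a risk-group (stratum-specific) likelihood ratio and of net benefit in decision curve analysis. It shows why a likelihood ratio can be infinite (no women with an uncomplicated course fall in the risk group) or 0 (no adverse outcomes occur in the risk group).

```python
import math


def stratum_likelihood_ratio(adverse_in_group, adverse_total,
                             uncomplicated_in_group, uncomplicated_total):
    """Likelihood ratio for one risk group (standard definition, assumed here):
    P(group | adverse outcome) / P(group | uncomplicated course)."""
    p_given_adverse = adverse_in_group / adverse_total
    p_given_uncomplicated = uncomplicated_in_group / uncomplicated_total
    if p_given_uncomplicated == 0:
        # Nobody with an uncomplicated course is in this group:
        # the ratio is infinite (or undefined if the group is empty).
        return math.inf if p_given_adverse > 0 else math.nan
    return p_given_adverse / p_given_uncomplicated


def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit at a given risk threshold (standard decision curve formula):
    TP/n - FP/n * threshold / (1 - threshold)."""
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


# Hypothetical counts for a single day: 10 adverse outcomes, 90 uncomplicated.
lr_very_high = stratum_likelihood_ratio(5, 10, 0, 90)   # infinite: strong rule-in
lr_very_low = stratum_likelihood_ratio(0, 10, 30, 90)   # zero: strong rule-out
lr_middle = stratum_likelihood_ratio(5, 10, 9, 90)      # finite, between the extremes
nb = net_benefit(true_pos=8, false_pos=20, n=100, threshold=0.2)
```

With few daily events, as in the later days of the 2-week window, a single woman moving between groups can shift these ratios substantially, which is one reason to read extreme values with caution.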