Background: No fetal growth standard is currently endorsed for universal use in the US. Newer standards improve upon the methodologic limitations of older studies, but before adopting them into practice, it is important to know how well they identify fetal undergrowth or overgrowth and predict subsequent neonatal morbidity or mortality in US populations.

Objective: To compare classification of estimated fetal weight (EFW) as <5th or <10th percentile or >90th percentile by six population-based fetal growth standards, and to compare the ability of these standards to predict a composite of neonatal morbidity and mortality.

Study Design: We used data from the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b) cohort, which recruited nulliparous women in the first trimester at eight US clinical centers (2010-2014). EFW was obtained from ultrasounds at 16-21 and 22-29 weeks (n=9530 women). We calculated rates of fetal growth restriction (EFW <5th and <10th percentiles; FGR<5 and FGR<10) and EFW >90th percentile (EFW>90) using standards from three large prospective fetal growth cohorts with similarly rigorous methodologies: INTERGROWTH-21st, World Health Organization (WHO) sex-specific and combined, and Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) race/ethnic-specific and unified, as well as the historical Hadlock (1991) reference. To determine whether differential classification of FGR or EFW>90 among standards was clinically meaningful, we then compared the area under the curve (AUC) and sensitivity of each standard for predicting three outcomes: small or large for gestational age (SGA or LGA) at birth, composite perinatal morbidity and mortality alone, and SGA or LGA with composite perinatal morbidity and mortality.

Results: The standards classified different proportions of fetuses as FGR and EFW>90 for ultrasounds at 16-21 (visit 2) and 22-29 (visit 3) weeks.
At visit 2, the NICHD race/ethnic-specific, WHO sex-specific, and WHO combined standards identified similar rates of FGR<10 (8.4-8.5%), with the remaining standards having lower rates, while NICHD race/ethnic-specific identified the highest rate of FGR<5 (5.0%). At visit 3, WHO sex-specific classified 9.2% of fetuses as FGR<10, while the other five classified lower proportions: WHO combined 8.4%, NICHD race/ethnic-specific 7.7%, INTERGROWTH 6.2%, Hadlock 6.1%, and NICHD unified 5.1%. INTERGROWTH classified the highest proportion as EFW>90 (21.3%), while Hadlock classified the lowest (8.3%). When predicting composite perinatal morbidity and mortality in the setting of early-onset FGR, WHO had the highest AUC of 0.53 (0.51, 0.53) for FGR<10 at 22-29 weeks, but the AUCs were similar among standards (0.52). Sensitivity was generally low across standards (22.7-29.1%). When predicting SGA-BW with composite neonatal morbidity or mortality for FGR<10 at 22-29 weeks, WHO sex-specific had the highest AUC (0.64; 95% CI 0.60, 0.67) and INTERGROWTH had the lowest (0.58; 95% CI 0.55, 0.62), though all standards had low sensitivity (7.0-9.6%).

Conclusions: Despite classifying different proportions of fetuses as FGR or EFW>90, all standards performed similarly in predicting perinatal morbidity and mortality. Differences among standards in the percentage of fetuses classified as FGR or EFW>90 may have clinical implications for the management of pregnancies, such as increased antenatal monitoring for FGR or cesarean delivery for suspected LGA. Our findings highlight the importance of knowing how standards perform in local populations, but more research is needed to determine whether any standard performs better at identifying risk of morbidity or mortality.
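
As an illustration of the two performance measures reported above, the following minimal Python sketch computes sensitivity and AUC for a single binary flag (for example, FGR<10 under one standard) against a binary outcome. It is not the study's analysis code. The function names and the toy data are hypothetical, and it assumes the AUC of one binary classifier, which equals the average of sensitivity and specificity; the study may have estimated AUC differently.

```python
def sensitivity_specificity(flags, outcomes):
    """Return (sensitivity, specificity) for a binary flag vs. a binary outcome."""
    pairs = list(zip(flags, outcomes))
    tp = sum(1 for f, o in pairs if f and o)          # flagged, outcome occurred
    fn = sum(1 for f, o in pairs if not f and o)      # missed, outcome occurred
    tn = sum(1 for f, o in pairs if not f and not o)  # not flagged, no outcome
    fp = sum(1 for f, o in pairs if f and not o)      # flagged, no outcome
    return tp / (tp + fn), tn / (tn + fp)


def binary_auc(flags, outcomes):
    """AUC of a single yes/no classifier: the ROC curve has one interior
    point, so the trapezoidal area is (sensitivity + specificity) / 2."""
    sens, spec = sensitivity_specificity(flags, outcomes)
    return (sens + spec) / 2


# Hypothetical toy data (NOT from nuMoM2b): 1 = flagged / outcome present.
toy_flags = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
toy_outcomes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

toy_sens, toy_spec = sensitivity_specificity(toy_flags, toy_outcomes)
toy_auc = binary_auc(toy_flags, toy_outcomes)
print(f"sensitivity={toy_sens:.2f} specificity={toy_spec:.2f} AUC={toy_auc:.2f}")
```

Under this assumption, a flag with low sensitivity and high specificity yields an AUC only modestly above 0.5, which is the pattern seen across the standards compared here.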