Objectives: Small-for-gestational-age (SGA) fetuses are at high risk of intrapartum fetal compromise requiring operative delivery. In a recent study, we developed the Individual RIsk aSsessment (IRIS) prediction model, which combines three antenatal variables (gestational age at delivery, parity, cerebroplacental ratio) and three intrapartum variables (epidural use, labor induction and augmentation using oxytocin) to predict operative delivery due to presumed fetal compromise in SGA fetuses. The aim of this study was to test the predictive accuracy of the IRIS prediction model in an external cohort of singleton pregnancies complicated by SGA.

Methods: This was an external validation study using a cohort of pregnancies from two tertiary referral centers in Spain and England. The inclusion criteria were singleton pregnancies diagnosed with an SGA fetus, defined as estimated fetal weight (EFW) below the 10th centile for gestational age at 36 weeks or beyond, which had fetal Doppler assessment and available data on intrapartum care and pregnancy outcomes. The main outcome was operative delivery for presumed fetal compromise. External validation was performed using the coefficients obtained in the original development cohort. The predictive accuracy of the models was assessed using receiver operating characteristic (ROC) curves. The Hosmer–Lemeshow test was used to test the goodness-of-fit of the models, and calibration plots were also obtained for visual assessment. A mobile application using the combined model algorithm was developed to facilitate clinical use.

Results: Four hundred and twelve singleton pregnancies with an antenatal diagnosis of SGA were included in the study. The operative delivery rate was 22.8% (n = 94).
Compared with those who did not require operative delivery for presumed fetal compromise, the group that required it had significantly fewer multiparous women (19.1 versus 47.8%, p < .001 in the total study population; 19.0 versus 43.5% and 19.2 versus 49.6% in the UK and Spain cohorts, respectively), lower cerebroplacental ratio (CPR) multiples of the median (MoM) (median: 0.77 versus 0.92, p < .001 in the total study population; 0.77 versus 0.92 and 0.77 versus 0.92 in the UK and Spain cohorts, respectively), more inductions of labor (74.5 versus 60.1%, p = .010 in the total study population; 85.7 versus 77.2% and 71.2 versus 53.1% in the UK and Spain cohorts, respectively) and more use of oxytocin augmentation (57.4 versus 39.3%, p = .002 in the total study population; 19.0 versus 12.0% and 68.5 versus 50.4% in the UK and Spain cohorts, respectively). When the original antenatal model was applied to the present cohort, we observed moderate predictive accuracy (AUC: 0.70, 95% CI: 0.64–0.76) and no signs of poor fit (p = .464). The original combined model, when applied to the external cohort without refitting, also had moderate predictive accuracy (AUC: 0.72, 95% CI: 0.67–0.77) and no signs of poor fit (p = .268). Refitting the combined model did not achieve a statistically significant increase in predictive accuracy (AUC: 0.76 versus 0.72, p = .060).

Conclusions: Using our recently published model, the predictive accuracy for fetal compromise requiring operative delivery in term fetuses thought to be SGA was modest, and the model showed no signs of poor fit in an external cohort. The IRIS tool for mobile devices has been developed to facilitate wide clinical use of this prediction model.

Brief rationale

Objective: To determine the external validity of an intrapartum risk prediction model for suspected small-for-gestational-age fetuses.

What is already known: Small-for-gestational-age fetuses are at increased risk of intrapartum compromise.
Fetal weight alone is a poor marker for adverse outcomes, and a comprehensive prediction model has been previously suggested.

What this study adds: The multivariable prediction model showed good accuracy and calibration in this external validation study. The significance of some variables differed between the original and external validation cohorts, and there was a small margin for improvement with model refitting. A mobile application has been developed to facilitate clinical use.
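
The validation procedure described in the Methods (applying the development-cohort coefficients unchanged to a new cohort, then assessing discrimination with the area under the ROC curve and calibration with the Hosmer–Lemeshow statistic) can be illustrated with a minimal Python sketch. The coefficient values, variable names and simulated cohort below are hypothetical placeholders, not the published IRIS coefficients or the study data, and only three of the six predictors are included for brevity.

```python
import math
import random


def predict_risk(intercept, coefs, row):
    """Predicted probability from a logistic model with FIXED coefficients (no refitting)."""
    linear_predictor = intercept + sum(coefs[name] * row[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def auc(y_true, y_prob):
    """AUC: probability that a random event is ranked above a random non-event (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups of near-equal size."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        if size == 0:
            continue  # skip empty groups when there are fewer rows than groups
        expected = sum(p for p, _ in chunk)  # expected number of events
        observed = sum(y for _, y in chunk)  # observed number of events
        # Contributions from events and from non-events in this group
        stat += (observed - expected) ** 2 / expected
        stat += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return stat


# HYPOTHETICAL coefficients for illustration only (not the IRIS model values).
HYPOTHETICAL_INTERCEPT = -1.0
HYPOTHETICAL_COEFS = {"multiparous": -1.2, "cpr_mom": -0.8, "induction": 0.5}

# SIMULATED "external" cohort; outcomes are drawn from the model itself.
rng = random.Random(0)
cohort = [
    {
        "multiparous": rng.randint(0, 1),
        "cpr_mom": rng.uniform(0.5, 1.5),
        "induction": rng.randint(0, 1),
    }
    for _ in range(400)
]
risks = [predict_risk(HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_COEFS, row) for row in cohort]
outcomes = [1 if rng.random() < r else 0 for r in risks]

print(f"AUC: {auc(outcomes, risks):.2f}")
print(f"Hosmer-Lemeshow statistic: {hosmer_lemeshow(outcomes, risks):.2f}")
```

In practice the Hosmer–Lemeshow statistic would be compared against a chi-square distribution to obtain a p-value, with a large p-value indicating no evidence of poor fit. The choice of ten risk groups is a common convention and an assumption here, not a detail taken from the study.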