Background
For women who have experienced recurrent pregnancy loss (RPL), it is crucial not only to provide treatment but also to evaluate the risk of recurrence. This study aimed to develop a risk prediction model for subsequent early pregnancy loss (EPL) in women with RPL based on preconception data.

Methods
A prospective, dynamic population cohort study was carried out at the Second Hospital of Lanzhou University. From September 2019 to December 2022, a total of 1050 non-pregnant women with RPL were enrolled. By December 2023, 605 women had subsequent pregnancy outcomes and were randomly divided into training and validation groups at a 3:1 ratio. In the training group, univariable screening of candidate predictors of subsequent EPL was performed. Least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression were then each used to select variables. Prediction models for subsequent EPL were constructed using a generalized linear model (GLM), gradient boosting machine (GBM), random forest (RF), and deep learning (DP). The variable sets selected by LASSO regression and by multivariate logistic regression were then compared using the best-performing model. The area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis (DCA) were used to assess the predictive performance of the best model. The best model was validated in the validation group. Finally, a nomogram was established based on the best predictive features.

Results
In the training group, the GBM model achieved the best performance, with the highest AUC (0.805). The AUCs of the models built on the variables selected by LASSO regression (16 variables) and by logistic regression (9 variables) did not differ significantly (AUC: 0.805 vs. 0.777, P = 0.1498). The 9-variable model also showed good discrimination in the validation group, with an AUC of 0.781 (95% CI: 0.702, 0.843).
The DCA showed that the model performed well and could support beneficial clinical decisions. Calibration curves showed good agreement between the values predicted by the model and the observed values (Hosmer–Lemeshow statistic 7.427, P = 0.505).

Conclusions
Predicting subsequent EPL in RPL patients using the GBM model has important clinical implications. Future prospective studies are needed to verify its clinical applicability.

Trial registration
This study was registered in the Chinese Clinical Trial Registry under registration number ChiCTR2000039414 (27/10/2020).
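
For readers unfamiliar with the three evaluation measures named above (AUC, the Hosmer–Lemeshow statistic, and the net benefit plotted in a decision curve), the following is a minimal illustrative sketch in plain Python. It is not the authors' code: the function names, the ten-group binning, and the synthetic data are all assumptions made for illustration, and none of the printed values correspond to the study's results.

```python
import random


def auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow chi-square statistic: compares observed and
    expected event counts within bins of sorted predicted risk.
    The choice of 10 bins is a common default, assumed here."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / size
        denom = size * mean_p * (1 - mean_p)
        if denom > 0:  # skip bins where predicted risk is exactly 0 or 1
            stat += (observed - expected) ** 2 / denom
    return stat


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one risk threshold (0 < threshold < 1), the
    quantity a decision curve plots across a range of thresholds."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


if __name__ == "__main__":
    # Synthetic, well-calibrated predictions (hypothetical data only).
    random.seed(0)
    probs = [random.random() for _ in range(400)]
    labels = [1 if random.random() < p else 0 for p in probs]
    print("AUC:", round(auc(labels, probs), 3))
    print("Hosmer-Lemeshow statistic:", round(hosmer_lemeshow(labels, probs), 3))
    for t in (0.2, 0.4, 0.6):
        print("Net benefit at", t, ":", round(net_benefit(labels, probs, t), 3))
```

A P value for the Hosmer–Lemeshow statistic is conventionally obtained from a chi-square distribution with (groups − 2) degrees of freedom; with SciPy available, `scipy.stats.chi2.sf(stat, groups - 2)` would give it. A large P value, such as the 0.505 reported above, indicates no evidence of poor calibration.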