Abstract

Study question
How does the novel automated scoring system in our laboratory correlate with treatment success in different subpopulations?

Summary answer
The web app automates embryo assessment, linking scores to morphology, development speed, ploidy, and clinical outcomes in treatments with both patient and donor oocytes.

What is known already
In the realm of assisted reproductive technologies, numerous platforms employing artificial intelligence (AI) have emerged for embryo assessment. Substantial research has been dedicated to evaluating the efficacy of AI in predicting pregnancy outcomes, particularly following blastocyst transfer. However, not all available options have undergone external validation, a crucial step deemed necessary before investing in such tools. This project endeavors to validate the utility of EMBRYOAID v1.0, a straightforward web application capable of assessing embryos through photos and/or videos.

Study design, size, duration
This is a retrospective cohort study including 485 patients who underwent IVF treatments with embryos (n = 5,710) cultured in EmbryoScope® time-lapse systems. Senior embryologists routinely assessed blastocysts using ASEBIR morphological criteria (A-D). Then, time-lapse videos were scored by the EMBRYOAID algorithm (0-10). We investigated the relationship between automated scoring and: (I) conventional morphology, (II) morphokinetic parameters, (III) euploidy rate, and (IV) implantation in cycles with own oocytes, egg donation, and preimplantation genetic testing for aneuploidies (PGT-A).

Participants/materials, setting, methods
The association with morphology and morphokinetics was investigated in 5,710 embryos. Ploidy correlation involved 258 blastocysts undergoing trophectoderm biopsy and next-generation sequencing analysis. Implantation association was studied in 381 single blastocyst transfers.
Finally, a multivariable analysis considered oocyte age, body mass index, type of transfer (fresh or frozen) and day of transfer (5 or 6) to quantify the relevance of automated scoring in different treatment subpopulations. Model performance was assessed through ROC curves and area under the curve (AUC) calculations.

Main results and the role of chance
For results that involved grouping, embryos were divided into quartiles based on sample size, as calculated by the statistical software. The deep-learning score was related to conventional morphology (7.9±1 for A, n = 295; 6.9±1.3 for B, n = 1,168; 5.9±1.3 for C, n = 830; and 3.1±1.8 for D, n = 3,417)*. Embryos that developed faster in early stages, as measured by division times to 2, 3, 4, and 5 cells, received higher scores*. The euploidy rate increased with the embryo score: 39.1% for ≤5.6 (n = 54), 45.5% for 5.6-6.2 (n = 61), 48.5% for 6.2-6.9 (n = 65) and 58.2% for >6.9 (n = 78). Similarly, the implantation rate increased with the score: 42.7% for ≤6.2 (n = 73), 50.6% for 6.2-7.4 (n = 87), 60.2% for 7.4-8.2 (n = 103) and 69.4% for >8.2 (n = 118). The stepwise multivariable analysis showed that embryo score was related to the odds of implantation in conventional treatments with patient oocytes (OR = 1.4; 95% CI [1.1-1.8]*) and in the oocyte donation program (OR = 1.2; 95% CI [1-1.4]*). The AUCs (95% CI) were 0.643 (0.559-0.727) for patient oocytes and 0.677 (0.624-0.731) for the oocyte donation program. There was no significant association of embryo score with implantation in PGT-A treatments.

Limitations, reasons for caution
Although the study serves as a thorough external validation, its retrospective nature introduces inherent limitations. Notably, in PGT-A treatments, embryos underwent assisted hatching on day 3, potentially influencing the model’s performance.
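To make the odds ratios reported in the main results easier to interpret, the following minimal Python sketch converts an odds ratio into a change in implantation probability. It assumes that each odds ratio refers to a one-point increase in embryo score, which the abstract does not state explicitly. The function name and the 50% baseline probability are illustrative choices, not values from the study.

```python
def probability_after_score_increase(baseline_prob, odds_ratio, score_increase=1.0):
    """Probability implied by multiplying the baseline odds by odds_ratio ** score_increase.

    Assumption: the odds ratio applies per one-point increase in embryo score.
    """
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio ** score_increase
    return new_odds / (1.0 + new_odds)


# Hypothetical 50% baseline; OR = 1.4 is the value reported for patient oocytes.
print(round(probability_after_score_increase(0.50, 1.4), 3))  # 0.583
```

Under these assumptions, a one-point higher score would move a 50% implantation probability to roughly 58%. The absolute change depends on the baseline chosen, so this is only an illustration of how to read an odds ratio.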
Wider implications of the findings
Our findings validate a user-friendly tool for automated embryo scoring in the laboratory. The achieved performance is comparable or superior to that of other tested models, particularly in egg donation programs, potentially encouraging clinics to adopt this tool routinely.

Trial registration number
Not applicable
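For readers less familiar with the AUC used to report model performance, the sketch below computes it as the proportion of (implanted, non-implanted) embryo pairs in which the implanted embryo has the higher score, counting ties as half. The scores and outcomes are made-up toy values, not study data, and the function name is hypothetical. The abstract does not describe how the authors' statistical software computed the AUC.

```python
def roc_auc(scores, outcomes):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy embryo scores (0-10) and implantation outcomes (1 = implanted); not study data.
toy_scores = [8.5, 7.9, 6.1, 7.2, 5.0, 6.8]
toy_outcomes = [1, 0, 0, 1, 0, 1]
print(roc_auc(toy_scores, toy_outcomes))  # 7 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.643 and 0.677 should be read.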