Abstract

Study question
Can an image-based AI model be trained to predict from ultrasound images whether an endometrium, prior to progesterone, is receptive to successful embryo implantation?

Summary answer
An endometrial receptivity AI model, utilizing ultrasound images and clinical features, is predictive of successful implantation and outperforms endometrial thickness (EMT) as a predictor of implantation.

What is known already
Despite its importance in successful embryo implantation and subsequent live birth, endometrial receptivity is difficult to measure. The standard of care at some clinics is to cancel an embryo transfer if EMT measures less than 7 mm on ultrasound assessment; however, reaching an EMT threshold does not guarantee success, even with euploid blastocysts. Currently available receptivity tests (e.g. endometrial receptivity assays) are unreliable and have not shown significant improvements in reproductive outcomes, necessitating the development of new evaluation methods for endometrial receptivity. AI can potentially address this gap in clinical care and understanding, aiming to increase patient success.

Study design, size, duration
79,602 ultrasound images (40,910 patients) of the endometrium, taken on the day of progesterone start in the patient's frozen embryo transfer cycle, were retrospectively collected from 4 clinic networks comprising 70 clinic locations. PGT-A outcomes and blastocyst quality were available for 15,466 and 36,070 blastocysts respectively, with 14,538 having labels for both. Different image samples (based on embryo quality) were selected to train and test several AI models, with implantation (positive beta-hCG) as a binary outcome.

Participants/materials, setting, methods
Three criteria for selecting samples of the negative (non-implantation) class were tested for receptivity model development: 1) high- or medium-quality and/or euploid blastocysts; 2) high-quality blastocysts; 3) all blastocysts.
Models were ensembles consisting of an image-based deep learning (DL) model and a feature-based machine learning (ML) model. Model performance was measured by area under the curve (AUC), sensitivity, and specificity. Feature importance from the ML model was assessed, and receptivity model performance was compared with the power of EMT to predict implantation.

Main results and the role of chance
A receptivity model trained under criterion 1 (n = 27,424) achieved AUC 0.613, sensitivity 0.716, and specificity 0.439 in predicting implantation on a test set of 9,197 samples. A second receptivity model trained under criterion 2 (n = 25,625) achieved AUC 0.627, sensitivity 0.637, and specificity 0.543 on a test set of 8,642 samples. A third receptivity model trained under criterion 3 (n = 31,238) attained the best performance, with AUC 0.631, sensitivity 0.628, and specificity 0.556 on a test set of 10,422 samples, and was selected for further analysis. The ensemble model also utilizes relevant clinical features, which were ranked by importance to model predictions: EMT; progesterone blood test value; age at transfer; previous total embryo transfers; days between ultrasound date and transfer date; age at retrieval; and oocyte origin. EMT was assessed for its ability to predict implantation at thresholds between 5 and 20 mm. The threshold that best separates positive and negative implantation was determined by evaluating AUC on a validation set of 5,682 samples. A threshold of 8.8 mm achieved AUC 0.576, sensitivity 0.694, and specificity 0.455 in predicting implantation on a test set of 5,686 samples. By comparison, the receptivity model AUC (0.631) was significantly higher (p < 0.001; DeLong test for differences in AUC), outperforming EMT as a predictor of implantation.

Limitations, reasons for caution
The endometrial receptivity model was built and tested on retrospective frozen embryo transfer data.
Results should be corroborated in prospective and non-selection studies, as well as in multiple geographical locations (all included clinics are located in the United States). The dataset requires greater diversification, with greater representation of cases with EMT below 7 mm.

Wider implications of the findings
This study indicates that an AI model can be built to predict implantation from ultrasound images and clinical features, surpassing the current standard of EMT alone. Further experimentation with additional clinical features may improve performance and facilitate development of a model that can accurately predict the gold-standard outcome of live birth.

Trial registration number
Not applicable
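
Illustrative sketch (not part of the study): the EMT baseline described under "Main results" scans thresholds between 5 and 20 mm and keeps the one that best separates implantation outcomes on a validation set. The abstract does not state how AUC was computed for a single threshold, so the Python sketch below assumes the single-point AUC of a binary classifier, (sensitivity + specificity) / 2, and also shows a rank-based ROC AUC for a continuous score. All function names, the 0.1 mm step, and the toy data are hypothetical. The DeLong test is not implemented here.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    emt_mm: Sequence[float], implanted: Sequence[int], threshold_mm: float
) -> Tuple[float, float]:
    """Predict implantation when EMT >= threshold; return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for thickness, outcome in zip(emt_mm, implanted):
        predicted_positive = thickness >= threshold_mm
        if outcome == 1 and predicted_positive:
            tp += 1
        elif outcome == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def single_point_auc(sensitivity: float, specificity: float) -> float:
    """AUC of a one-threshold classifier (assumed reading of the abstract)."""
    return (sensitivity + specificity) / 2.0


def best_threshold(
    emt_mm: Sequence[float], implanted: Sequence[int], candidates: Sequence[float]
) -> Tuple[float, float]:
    """Return the candidate threshold with the highest single-point AUC and that AUC."""
    scored = [
        (single_point_auc(*sensitivity_specificity(emt_mm, implanted, t)), t)
        for t in candidates
    ]
    # max() keeps the first (lowest) threshold when several share the top score.
    best_score, best_t = max(scored, key=lambda pair: pair[0])
    return best_t, best_score


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based ROC AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy validation data: EMT in mm and implantation outcome (1 = positive beta-hCG).
val_emt: List[float] = [6.0, 7.5, 8.5, 9.0, 9.5, 10.0, 11.0, 12.0]
val_implanted: List[int] = [0, 0, 0, 1, 1, 1, 0, 1]

# Candidate thresholds from 5.0 to 20.0 mm in 0.1 mm steps (step size is an assumption).
candidates = [round(5.0 + 0.1 * i, 1) for i in range(151)]

threshold, score = best_threshold(val_emt, val_implanted, candidates)
sens, spec = sensitivity_specificity(val_emt, val_implanted, threshold)
print(f"threshold={threshold} mm, AUC={score:.3f}, sens={sens:.3f}, spec={spec:.3f}")
print(f"ROC AUC of raw EMT as a continuous score: {roc_auc(val_emt, val_implanted):.4f}")
```

In a setup like the one described, the threshold would be chosen on the validation set and then reported once on the held-out test set, so that the reported sensitivity and specificity are not tuned to the data they are measured on.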