Abstract

Study question: Can an AI model be applied to datasets with different characteristics?

Summary answer: The AI model for predicting pregnancy, developed on microscope images, can be generally applied to multifocal images from a timelapse system.

What is known already: AI models require large datasets to generalize and handle various scenarios, but it can be challenging to gather enough data for each case. Embryologists evaluate embryos using multiple focal planes and add color filters as needed, and timelapse images have varying features. AI models in past studies were trained using both microscopic and timelapse images, and embryologists question whether images must be captured at a specific focal plane for AI to work effectively. In this study, an AI model trained on over 2,000 microscopic embryo images was validated using timelapse images taken at multiple focal planes.

Study design, size, duration: We collected 2,555 microscopic images from 7 IVF clinics and 1,299 timelapse images of 433 embryos from a single IVF clinic between July 2016 and December 2020. The timelapse images were divided into 3 groups: Group 1 contained the images in which the ICM was best visualized, Group 2 the images 20 µm higher or lower than Group 1, and Group 3 the images 20 µm higher or lower than Group 2.

Participants/materials, setting, methods: We built 2 CNN models. The “Microscopic model” and the “Timelapse model” were trained and validated using 3-fold cross-validation with the 2,555 microscopic images and the 433 timelapse Group 1 images, respectively. To examine whether the Microscopic model was able to infer well for images taken at different focal planes, Group 1, 2, and 3 images were used to test the Microscopic model.

Main results and the role of chance: The mean (SD) AUROC and accuracy of the Microscopic model were 0.738 (0.003) and 0.705 (0.011) after 3-fold cross-validation. The AUROC and accuracy of the Timelapse model were 0.627 and 0.583. When the Microscopic model was applied to the timelapse images, the AUROCs for Groups 1, 2, and 3 were 0.699 (0.014), 0.705 (0.001), and 0.701 (0.064), respectively, and the accuracies were 0.655 (0.013), 0.647 (0.007), and 0.651 (0.007). The predictive power of the Microscopic model applied to the timelapse images was better than that of the Timelapse model, although it was not as accurate as the Microscopic model applied to the microscope images.

Limitations, reasons for caution: The limitations of the study include its retrospective nature and its small dataset. Applying transfer learning to adapt the Microscopic model to timelapse images may be beneficial for analyzing timelapse images.

Wider implications of the findings: The study found that an AI model built on a large dataset of microscopic images collected from multiple centers can be applied to timelapse images captured at multiple focal planes. It may be prudent for a small IVF clinic to apply a generalizable model rather than building its own.

Trial registration number: Not applicable
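For readers unfamiliar with how per-fold metrics of the kind reported above are summarized as mean (SD), the following is a minimal sketch, not the authors' code. It assumes each fold supplies binary pregnancy labels and predicted probabilities, uses a 0.5 decision threshold for accuracy, and takes SD to be the sample standard deviation across folds; the abstract states none of these details, and the function names `auroc`, `accuracy`, and `summarize` are hypothetical.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at the given threshold (assumed 0.5)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def summarize(folds, threshold=0.5):
    """Return ((mean AUROC, SD), (mean accuracy, SD)) over a list of folds.

    Each fold is a (labels, scores) pair. SD is the sample standard
    deviation, which is an assumption about the reporting convention.
    """
    aucs = [auroc(y, s) for y, s in folds]
    accs = [accuracy(y, s, threshold) for y, s in folds]
    return (mean(aucs), stdev(aucs)), (mean(accs), stdev(accs))
```

Calling `summarize` with three `(labels, scores)` pairs, one per validation fold, would yield a pair of mean (SD) tuples in the same form as the figures quoted in the results.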