Abstract

Study question
Can calibrated AI ploidy screening test results provide reliable, biologically justified estimates of embryo euploidy?

Summary answer
The AI ploidy screening test, rooted in the embryo’s morphokinetic profile and clinical metadata, provides reliable probabilistic estimates of euploidy.

What is known already
Published ploidy algorithms typically provide a binary classification of embryo ploidy (aneuploid/euploid). Algorithmic outputs are thresholded: a value above the threshold yields a euploid label and a value below it an aneuploid label, and the confidence in the label prediction is tested. Experience shows, however, that decision-makers have difficulty interpreting how well these predictions match the true prevalence of euploidy in their clinics, especially when patient age and embryo quality are taken into consideration. An AI embryo ploidy screening tool that uses biologically relevant inputs to provide reliable euploidy probabilities is needed.

Study design, size, duration
The AI tool was trained on 5,000 time-lapse video sequences, along with associated clinical parameters: biological age at time of retrieval, Day-5 embryo quality, and morphokinetic parameters ranging from time of pronuclear fading to time to blastulation. Probability calibration was applied, and its performance was evaluated using a blind test dataset (N = 708 embryos; euploid = 352; aneuploid = 356) with known prevalences of euploidy, aneuploidy, and live birth. Mean ± SD patient age: 35.9 ± 5.4 years.

Participants/materials, setting, methods
The AI ploidy screening tool used known embryo ploidy status as ground-truth labels; biological age and visual quality parameters were incorporated as continuous input features to maintain biological validity.
Reliability curve analysis, which plots the observed frequency of euploidy in the clinical input data (y-axis) against the probability predicted by the screening test (x-axis), was used to assess model confidence. Odds ratios (OR) were used to confirm the significance of associations.

Main results and the role of chance
Logistic regression analysis shows that AI scores are robustly associated with euploidy probability (OR = 2.79 [95% CI = 2.04-3.81] when comparing euploid likelihood for high versus low AI scores at a threshold of 0.5). For embryo cohorts in the blind test set containing at least one euploid and at least one aneuploid embryo (N = 57 cohorts), the highest AI-ranked embryo was euploid in 64% of the cohorts. Embryos were divided into four predefined brackets according to their scores (1-32, 33-49, 50-66, 67-99), and the euploidy rate per bracket was determined: 28%, 44%, 58%, and 71%, respectively. There was a linear association between ascending AI scores and percentage of euploid embryos, with the highest level of model confidence achieved at the tail ends of the scale; embryos with a score above 66 were 2.5 times more likely to be euploid than embryos with a score below 33.

Limitations, reasons for caution
This analysis used historical time-lapse sequences. Prospective use of AI for ploidy screening remains to be evaluated. Genetic status in utero or at birth was not evaluated.

Wider implications of the findings
A novel AI approach for preimplantation embryo screening using clinical metadata and time-lapse videos can improve our ability to non-invasively predict euploid likelihood prior to confirmatory diagnostic preimplantation genetic testing.

Trial registration number
Not applicable
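
Illustrative sketch
The reliability curve and score-bracket analyses described in the methods and results can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name `reliability_table` is hypothetical, the toy scores and labels are synthetic rather than study data, and reading a score s as a predicted probability of s / 100 is an assumption made only for this example. The bracket edges are the ones reported in the abstract.

```python
from bisect import bisect_right

# Score brackets from the abstract: 1-32, 33-49, 50-66, 67-99.
# Each edge is the first score of a bracket; 100 closes the last one.
BRACKET_EDGES = [1, 33, 50, 67, 100]


def reliability_table(scores, is_euploid, edges=BRACKET_EDGES):
    """Return one row per non-empty bracket:
    (low, high, mean predicted probability, observed euploid fraction, n).

    Assumption (not from the source): score s means probability s / 100.
    """
    n_bins = len(edges) - 1
    score_sum = [0.0] * n_bins
    euploid_count = [0] * n_bins
    total = [0] * n_bins

    for s, y in zip(scores, is_euploid):
        # Index of the bracket whose lower edge is <= s, clamped to range.
        i = max(0, min(bisect_right(edges, s) - 1, n_bins - 1))
        score_sum[i] += s
        euploid_count[i] += y
        total[i] += 1

    rows = []
    for i in range(n_bins):
        if total[i]:
            rows.append((
                edges[i],
                edges[i + 1] - 1,
                score_sum[i] / total[i] / 100,   # x-axis: predicted
                euploid_count[i] / total[i],     # y-axis: observed
                total[i],
            ))
    return rows


# Synthetic toy data (1 = euploid, 0 = aneuploid); NOT study data.
toy_scores = [10, 25, 30, 40, 45, 55, 60, 70, 80, 95]
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

for low, high, predicted, observed, n in reliability_table(toy_scores, toy_labels):
    print(f"{low:>2}-{high:<2}  predicted={predicted:.2f}  observed={observed:.2f}  n={n}")
```

Rows in which the predicted and observed values are close lie near the diagonal of a reliability curve, which is what good calibration looks like. As a check on the reported figures, the top and bottom bracket euploidy rates of 71% and 28% give a ratio of about 2.5, consistent with the stated 2.5-fold difference.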