Abstract Study question: Are embryologists across several IVF clinics concordant when evaluating embryo morphology? Summary answer: Embryo morphological grading is sufficiently consistent among embryologists from the same center, while interactive training was essential to improve its accuracy across several clinics. What is known already: Embryo morphology, mostly at the blastocyst stage, is the strongest non-invasive embryological feature associated with implantation potential. This association is also confirmed when euploid blastocysts are transferred. At present, several embryo grading schemes exist, but it is still unclear which is the most effective. Moreover, many IVF clinics adopt internal embryo grading scores, further limiting the transferability of this crucial prognostic information across laboratories. The Italian Society of Embryology, Reproduction and Research (SIERR) conceived this study to assess the level of concordance in embryo grading within and between IVF clinics. Study design, size, duration: We photographed 40 cleavage-stage and 40 blastocyst-stage embryos (3 focal planes each, 240 photos in total). Two embryologists (one senior and one junior) from each of 65 Italian IVF clinics were invited to grade them. Their evaluations were collected blindly as Phase I (January 2020 to March 2020). Phase II consisted of interactive training on Google Classroom, during which 6 selected experts reached a consensus on the morphological evaluation of the 80 embryos (April 2020). In Phase III (May 2020 to July 2020), a second set of 240 pictures was sent to the senior participants and the experts. Participants/materials, setting, methods: Eighteen centers agreed to participate, and 36 embryologists were included.
The embryo grading scheme adopted was the Alpha-ESHRE Istanbul Consensus (parameters: cleavage-stage blastomeres’ symmetry and fragmentation; blastocyst’s expansion, inner-cell-mass quality and trophectoderm quality), routinely used in 50% of the centers (N = 9/18). The concordance within centers (junior versus senior) and between centers (senior versus experts) was calculated using Cohen’s kappa (k). The concordance between centers was compared before and after the interactive training on the two sets of pictures. Main results and the role of chance: The centers and embryologists included were representative of the Italian IVF scenario (mean±SD, range): oocyte retrievals per year: 711±636, 100–2200; cycles with cleavage-stage embryo transfer: 322±339, 0–1300; cycles with blastocyst-stage embryo transfer: 390±403, 0–1100; operators per center: 5.6±4.0, 2–13; senior embryologists’ experience: 14.8±7.4 yr, 7–30; junior embryologists’ experience: 2.7±0.6 yr, 1–3. The intra-center concordance was (i) for blastomeres’ symmetry 82±15% (38–100%), k 0.59±0.27 (0.02–1); (ii) for blastomeres’ fragmentation 88±9% (65–100%), k 0.71±0.2 (0.29–1); (iii) for blastocysts’ expansion 80±16% (48–100%), k 0.66±0.27 (0.19–1); (iv) for inner-cell-mass quality 73±16% (35–95%), k 0.58±0.24 (0.07–0.92); (v) for trophectoderm quality 71±19% (38–95%), k 0.54±0.32 (0.01–0.97). Linear regressions showed no association between the centers’ and embryologists’ characteristics and any of the k values. We selected the 6 experts for the interactive training from the clinics with the highest mean number of cycles per year and the highest intra-center concordance.
We then calculated the inter-center concordance as the agreement rate between senior embryologists and the experts in Phase I and Phase III, respectively: (i) for blastomeres’ symmetry 67±15% (30–85%) and 73±17% (15–90%; Wilcoxon signed-rank test, p=0.06), k 0.33±0.22 (–0.29 to 0.58) and 0.42±0.33 (–0.56 to 0.77); (ii) for blastomeres’ fragmentation 81±17% (23–95%) and 83±14% (50–95%; Wilcoxon signed-rank test, p=0.8), k 0.54±0.22 (–0.05 to 0.84) and 0.55±0.22 (0.17–0.81); (iii) for blastocysts’ expansion 59±16% (35–85%) and 67±17% (23–90%; Wilcoxon signed-rank test, p=0.04), k 0.35±0.20 (0.06–0.73) and 0.44±0.22 (–0.10 to 0.7); (iv) for inner-cell-mass quality 60±14% (33–80%) and 69±11% (48–85%; Wilcoxon signed-rank test, p=0.02), k 0.40±0.20 (0.01–0.69) and 0.51±0.18 (0.18–0.77); (v) for trophectoderm quality 55±12% (23–70%) and 63±10% (48–78%; Wilcoxon signed-rank test, p<0.01), k 0.29±0.15 (–0.08 to 0.52) and 0.42±0.15 (0.21–0.66). Limitations, reasons for caution: Only 28% (N = 18/65) of the Italian IVF centers invited to participate responded to the survey. The routine adoption of grading schemes other than the Istanbul Consensus by 50% of the embryologists might have biased their evaluations. The experts were not fully concordant in grading 13.8% of the embryos (N = 22/160), which required active discussion. Wider implications of the findings: Blastocyst-grading concordance was significantly improved after the training phase. Therefore, interactive consensus meetings and training platforms are needed to standardize this practice across centers. The “avant-garde” of artificial intelligence applied to embryo image analysis might help overcome this issue in the future. Trial registration number: N.A.
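
The abstract reports two concordance measures for each pair of graders: the raw agreement rate and Cohen's k. As a minimal illustrative sketch (not the authors' analysis code), both can be computed from two lists of grades with k = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each grader's own grade frequencies. The function names and the example grades below are hypothetical and are not taken from the study data.

```python
from collections import Counter


def percent_agreement(grades_a, grades_b):
    """Fraction of embryos to which both graders assigned the same grade."""
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("both graders must score the same non-empty set of embryos")
    matches = sum(a == b for a, b in zip(grades_a, grades_b))
    return matches / len(grades_a)


def cohen_kappa(grades_a, grades_b):
    """Unweighted Cohen's kappa for two graders scoring the same embryos."""
    n = len(grades_a)
    p_o = percent_agreement(grades_a, grades_b)  # observed agreement
    counts_a, counts_b = Counter(grades_a), Counter(grades_b)
    # Chance agreement: for each grade, the product of the two graders'
    # marginal frequencies, summed over all grades either grader used.
    p_e = sum(counts_a[g] * counts_b[g] for g in set(counts_a) | set(counts_b)) / (n * n)
    if p_e == 1.0:
        # Both graders used one identical grade throughout; kappa is
        # undefined (0/0), so perfect agreement is reported by convention.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical grades (1-3) for ten embryos, NOT data from the study.
    senior = [1, 1, 2, 2, 3, 3, 1, 2, 3, 1]
    junior = [1, 1, 2, 3, 3, 3, 1, 2, 2, 1]
    print(f"agreement = {percent_agreement(senior, junior):.0%}")
    print(f"kappa     = {cohen_kappa(senior, junior):.2f}")
```

In this made-up example the two graders agree on 8 of 10 embryos, yet k is only about 0.70 because roughly a third of that agreement is expected by chance. The same effect explains why the k values in the abstract are consistently lower than the corresponding agreement rates.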