Abstract

Study question
Is it possible to identify genetic factors that influence the number of MII oocytes produced during ovarian stimulation?

Summary answer
Data exploration techniques identified gene variants that markedly improved the ability to predict the number of MII oocytes to be retrieved.

What is known already
Several well-known clinical factors correlate with the number of MII oocytes that can be retrieved after ovarian stimulation. When deciding on an appropriate stimulation protocol and gonadotropin doses, clinicians mainly consider the patient’s age and selected markers of ovarian reserve. The choice of protocol and dosage itself influences the outcome of the stimulation. Certain genes have been shown to participate in regulating the expression of FSH, LH and HCG receptors, although the direct influence of genetic background on the number of MII oocytes still needs to be elucidated.

Study design, size, duration
In this retrospective study, data collected between 05/08/2013 and 28/10/2020 were analyzed. The dataset consisted of 516 ovarian stimulations undergone by 264 patients and 605 unique changes found in sequence data for 14 genes (AMH, AMHR2, FSHB, FSHR, LHB, LHCGR, PRL, PRLR, AR, ESR1, ESR2, GDF9, BMP15, SOX3). Inclusion criteria restricted the women’s age (between 24 and 46 years) and Anti-Müllerian hormone (AMH) results (lower than 15).

Participants/materials, setting, methods
A neural network model based on 7 statistically important clinical factors was trained to predict the number of MII oocytes and used as a benchmark for a model that also included genetic factors. Model performance was evaluated using the median absolute error and validated on the test subset. Feature importance in the model was assessed using Shapley values. Important gene variants were selected using a grouping algorithm (Self-Organizing Map) and a haplotype block generation algorithm (Four Gamete Rule).
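To illustrate the haplotype block step named in the methods, the following is a minimal sketch of a four-gamete test, not the authors' pipeline. It assumes phased haplotypes encoded as strings of 0/1 alleles and uses a simple greedy left-to-right partition; the function names, the `min_count` threshold and the toy data are hypothetical. Two biallelic sites are treated as showing evidence of recombination when all four two-site combinations (00, 01, 10, 11) are observed, and a block is closed as soon as a new site shows all four gametes with any site already in the block.

```python
def four_gametes_present(haplotypes, i, j, min_count=1):
    """Return True if all four two-site gametes occur at sites i and j."""
    counts = {}
    for hap in haplotypes:
        pair = (hap[i], hap[j])
        counts[pair] = counts.get(pair, 0) + 1
    # Count only gametes seen at least min_count times (assumed threshold).
    observed = [pair for pair, n in counts.items() if n >= min_count]
    return len(observed) == 4


def four_gamete_blocks(haplotypes):
    """Greedily split consecutive sites into blocks with no four-gamete pair."""
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    if n_sites == 0:
        return []
    blocks = []
    current = [0]
    for site in range(1, n_sites):
        # Close the block if the new site conflicts with any site in it.
        if any(four_gametes_present(haplotypes, prev, site) for prev in current):
            blocks.append(current)
            current = [site]
        else:
            current.append(site)
    blocks.append(current)
    return blocks


# Toy phased haplotypes (hypothetical data): sites 0-1 and 2-3 are each
# perfectly linked, while sites 0 and 2 show all four gametes.
toy_haplotypes = ["0000", "0011", "1100", "1111"]
blocks = four_gamete_blocks(toy_haplotypes)
print(blocks)
```

On the toy data the sketch yields two blocks, sites 0 to 1 and sites 2 to 3. Established tools apply a frequency threshold to the gamete counts rather than a raw count, which is what `min_count` stands in for here.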
Main results and the role of chance
The benchmark model included the following clinical factors: AMH, antral follicle count on the first day of stimulation (AFC), patient age, the number of MII oocytes and denuded cumuluses obtained in the previous cycle (available for 37% of IVF processes), IVF protocol type and the presence of polycystic ovary syndrome (PCOS). The trained model achieved a median absolute error of 1.62. From the genetic data, 3 important variant combinations were identified (haplotypes IV40-8 and IV22-2 and a group of 6 variants labeled IV8-6), which are included in Table 1. Adding these 3 factors to the model decreased the median absolute error by 0.19 (12%). The average impact of the genetic factors on the model output magnitude (mean absolute SHAP value) was greater than the impact of the chosen IVF protocol or the presence of PCOS. The presence of haplotypes IV40-8 and IV22-2 increased the expected number of MII oocytes, whereas the greater the number of variants grouped as IV8-6 found in the patient, the lower the predicted number of MII oocytes. Additionally, when the patient’s previous cycle results were not available, the impact of genetic factors on the model output magnitude was comparable to that of AFC on the first day of stimulation.

Limitations, reasons for caution
Given the number of gene variants found (605) and the sample size of the study, a broader study would be beneficial to better understand the influence of less frequently observed variants.

Wider implications of the findings
The results suggest that including genetic factors could substantially increase the precision of MII oocyte count predictions, especially for patients undergoing ovarian stimulation for the first time. Knowing the expected MII oocyte count could help clinicians decide on gonadotropin dosing.

Trial registration number
Not applicable