Abstract

Study question
Can a machine learning (ML) algorithm suggest an optimal controlled ovarian stimulation (COS) protocol for maximizing the number of retrieved and mature (MII) oocytes?

Summary answer
Based on historical electronic medical record data, an ML-based medication regimen recommendation model may increase the number of retrieved oocytes.

What is known already
The application of artificial intelligence in reproductive medicine has mainly focused on image analysis of gametes and embryos and on the prediction of treatment outcomes. Research on optimizing the COS process is limited. Previous studies mainly considered one or a few types of COS protocol, such as the GnRH-a and GnRH-A protocols, and most aimed to support single-node decisions during COS, such as the optimal gonadotropin starting dose or the trigger day.

Study design, size, duration
The model was developed on 38,355 COS cycles from women who underwent their first IVF/ICSI treatment at a large fertility center between January 2008 and December 2022; 33,853 cycles formed the training set and 4,502 the test set. A subset of 8,382 cycles included information on MII oocytes.

Participants/materials, setting, methods
The goal of the model was to recommend the optimal COS protocol by optimizing COS outcomes. Similarity of patient characteristics was used as the model input. An XGBoost algorithm was constructed to calculate patient similarity in terms of COS outcomes. Its accuracy was compared, using mean absolute error (MAE), with patient similarity calculated by KNN with Euclidean distance and by KNN with cosine distance. The models were also compared on how much they improved the average number of retrieved and MII oocytes.
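The KNN baselines named in the methods can be illustrated with a minimal sketch. It assumes that the outcome for a new patient is predicted as the mean number of retrieved oocytes among the k most similar training patients, which the abstract does not state explicitly. The feature vectors, outcome values and function names below are hypothetical toy examples, not study data, and the XGBoost-based similarity is not reproduced because the abstract does not describe how it was computed.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity); 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbours(query, features, k, distance):
    """Indices of the k training patients closest to the query patient."""
    order = sorted(range(len(features)), key=lambda i: distance(query, features[i]))
    return order[:k]


def predict_oocytes(query, features, outcomes, k, distance):
    """Assumed predictor: mean outcome over the k most similar patients."""
    idx = nearest_neighbours(query, features, k, distance)
    return sum(outcomes[i] for i in idx) / len(idx)


def mean_absolute_error(y_true, y_pred):
    """MAE between observed and predicted oocyte counts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: two standardized patient features per cycle.
train_features = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
train_oocytes = [8, 10, 15, 17]
test_features = [[0.05, 0.0], [0.95, 1.0]]
test_oocytes = [10, 15]

predictions = [
    predict_oocytes(q, train_features, train_oocytes, 2, euclidean_distance)
    for q in test_features
]
mae = mean_absolute_error(test_oocytes, predictions)
print(predictions, mae)
```

Passing `cosine_distance` instead of `euclidean_distance` gives the second baseline; the study used k = 500 and k = 1000 rather than the k = 2 of this toy example.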

Main results and the role of chance
The patient similarity calculation method based on XGBoost had the lowest MAE when predicting the number of retrieved oocytes (XGBoost vs KNN+Euclidean distance, k = 500 vs KNN+Euclidean distance, k = 1000 vs KNN+cosine distance, k = 500 vs KNN+cosine distance, k = 1000: 3.729 vs 3.810 vs 3.815 vs 3.785 vs 3.788). Including AMH improved the accuracy of all recommendation models in predicting protocols and the number of retrieved oocytes. When the optimal number of MII oocytes was used as the recommendation target, the XGBoost-based patient similarity method again had the lowest MAE (XGBoost vs KNN+Euclidean distance, k = 500 vs KNN+Euclidean distance, k = 1000 vs KNN+cosine distance, k = 500 vs KNN+cosine distance, k = 1000: 2.868 vs 3.397 vs 3.452 vs 3.356 vs 3.395). Patients whose treatment followed the COS medication regimen recommended by the patient similarity method based on COS treatment outcomes had more retrieved oocytes than those who did not adopt the recommendation (P < 0.01). The recommendation model based on KNN+Euclidean distance (k = 500) performed best in improving the average number of retrieved oocytes (adopted vs did not adopt the recommendation: 13.01 ± 6.61 vs 10.09 ± 6.67, P < 0.01). There was no statistically significant difference in the number of MII oocytes between patients who adopted the recommendation and those who did not.

Limitations, reasons for caution
The medication regimen recommendation model is based solely on the predicted number of retrieved and MII oocytes and does not account for adverse treatment outcomes such as ovarian hyperstimulation syndrome. Moreover, the database used to train and validate the model comes from a single fertility center.

Wider implications of the findings
The recommendation model may help to increase the number of retrieved oocytes.
Starting from the recommended COS protocol, further mining of the medication content, the distribution of medication timing and changes in medication sequence within the similar-patient set can help clinicians develop individualized COS medication regimens for infertile patients.

Trial registration number
Not applicable
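
The abstract does not spell out how a protocol is chosen from the similar-patient set or how adoption was assessed. A minimal sketch under stated assumptions: the recommended protocol is the one with the highest mean number of retrieved oocytes among the similar patients, protocols with too few similar patients are ignored, and "adopting the recommendation" means the protocol actually used equals the recommended one. The function names, the `min_support` threshold and all data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def recommend_protocol(neighbours, min_support=2):
    """Pick the protocol with the highest mean outcome among similar patients.

    neighbours: list of (protocol, retrieved_oocytes) pairs for the similar-patient set.
    min_support: hypothetical minimum number of similar patients per protocol.
    Returns None when no protocol has enough support.
    """
    by_protocol = defaultdict(list)
    for protocol, oocytes in neighbours:
        by_protocol[protocol].append(oocytes)
    eligible = {
        p: mean(values) for p, values in by_protocol.items() if len(values) >= min_support
    }
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


def compare_adoption(records):
    """Mean retrieved oocytes for cycles that did vs did not follow the recommendation.

    records: list of (recommended_protocol, actual_protocol, retrieved_oocytes).
    """
    adopted = [o for rec, act, o in records if rec == act]
    not_adopted = [o for rec, act, o in records if rec != act]
    return (
        mean(adopted) if adopted else None,
        mean(not_adopted) if not_adopted else None,
    )


# Hypothetical similar-patient set for one new patient.
similar_patients = [
    ("GnRH-a", 12), ("GnRH-a", 14),
    ("GnRH-A", 9), ("GnRH-A", 11),
    ("other", 20),  # only one similar patient, so excluded by min_support
]
recommended = recommend_protocol(similar_patients)

# Hypothetical test-set records: (recommended, actually used, retrieved oocytes).
test_records = [
    ("GnRH-a", "GnRH-a", 13),
    ("GnRH-a", "GnRH-A", 10),
    ("GnRH-A", "GnRH-A", 12),
    ("GnRH-A", "GnRH-a", 9),
]
adopted_mean, not_adopted_mean = compare_adoption(test_records)
print(recommended, adopted_mean, not_adopted_mean)
```

The same grouping step could be extended to summarize medication content or timing within the similar-patient set, again only as an illustration of the idea rather than the study's implementation.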