Abstract

Study question: Can the tracking of cell division and the prediction of human embryo cleavage stages be automated in time-lapse videos (TLV) using AI object detection methods?

Summary answer: We developed software that predicts blastomere count and tracks cell cleavages up to the 4-5-cell stage. The software uses the YOLOv5 object detection technique to detect cells.

What is known already: Embryo morphology plays an important part in determining viability. Parameters such as the number of cells present following fertilization, abnormal cell division (reverse or direct) and cleavage-stage timing correlate with pregnancy rates. However, continuous manual evaluation is time-consuming, and automation will assist in embryo viability assessment. YOLOv5 has been shown to detect objects accurately in images and video frames. Its detection performance is commonly reported as mean average precision (mAP), which summarizes precision across recall levels for each object class.

Study design, size, duration: We developed software that uses YOLOv5 to detect the cells present in each frame of a TLV and then marks each cell boundary with a differently colored circular overlay using OpenCV. We trained YOLOv5 to detect three object classes (cell, morula and blastocyst) using 150 images covering different cell stages, morulae and blastocysts. For the cell class, mAP was 0.65. The annotated object locations in the images and the YOLOv5 predictions were reviewed by embryologists. We evaluated the software on TLVs from 11 patients.

Participants/materials, setting, methods: After YOLOv5 detects the cells in a frame, the software computes the cell count and assigns each cell a different color, which is maintained until the cell divides into daughter cells. The daughter cells are then also assigned different colors. If a frame has a preceding frame, the software calculates the proximity of each detected cell to each cell in the preceding frame and copies that cell's color, provided the proximity is within a threshold. The software outputs the TLV with colored overlays.
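The color propagation step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each detection is reduced to a normalized centre point (x, y), that proximity is the Euclidean distance between centres, and that matches are made greedily from nearest to farthest. The function name `propagate_colors` and the integer color identifiers are hypothetical.

```python
import math
from itertools import count


def propagate_colors(prev_cells, curr_centers, threshold, next_color):
    """Copy colors from the preceding frame to nearby detections.

    prev_cells:   list of ((x, y), color_id) from the preceding frame
    curr_centers: list of (x, y) detections in the current frame
    threshold:    maximum centre distance for a color to be copied
    next_color:   iterator yielding unused color ids (hypothetical helper)
    """
    # Collect every (distance, previous index, current index) pair
    # whose distance is within the threshold.
    candidates = []
    for i, (prev_center, _) in enumerate(prev_cells):
        for j, center in enumerate(curr_centers):
            d = math.dist(prev_center, center)
            if d <= threshold:
                candidates.append((d, i, j))
    candidates.sort()

    # Greedy one-to-one matching, nearest pairs first (an assumption).
    assigned = {}
    used_prev = set()
    for _, i, j in candidates:
        if i in used_prev or j in assigned:
            continue
        assigned[j] = prev_cells[i][1]
        used_prev.add(i)

    # Unmatched detections (for example a new daughter cell) get a new color.
    return [
        (center, assigned[j] if j in assigned else next(next_color))
        for j, center in enumerate(curr_centers)
    ]


# Two cells in the preceding frame; the second has divided in the current one.
previous = [((0.30, 0.50), 0), ((0.70, 0.50), 1)]
current = [(0.31, 0.50), (0.68, 0.45), (0.75, 0.58)]
tracked = propagate_colors(previous, current, threshold=0.10, next_color=count(2))
colors = [color for _, color in tracked]  # [0, 1, 2]
```

In this simplified version the daughter closest to the parent keeps the parent's color and only the other daughter receives a new one. A version that recolors both daughters would additionally need to detect that the cell count has increased.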
Main results and the role of chance: In the starting frames of TLVs showing a single cell, the software accurately detected the 1-cell stage (high precision = 0.99, high recall = 0.83, high F1-score = 0.90). We observed some misclassification between 1-cell and morula, possibly because a compacted morula resembles a single cell. The best performance was observed for the 2-cell stage (high precision = 0.91, high recall = 0.98, high F1-score = 0.95). The 4-cell stage was sometimes misclassified as 3-cell or 5-cell (high precision = 0.88, low recall = 0.59, high F1-score = 0.71). One reason may be that overlap between cells increases with the number of cells. The 3-cell and 5-cell stages were confused with other stages, but cleavage-stage detection was still better than random: 3-cell (moderate precision = 0.43, high recall = 0.83, moderate F1-score = 0.49) and 5-cell (moderate precision = 0.44, moderate recall = 0.40, moderate F1-score = 0.40). For stages beyond 5 cells, YOLOv5 detected fewer cells than were present, and the software predicted cleavage 9-10 frames later than it occurred, on average. The proximity threshold used was 0.10 for cell counts <4 and 0.05 for cell counts >4. In 5 TLVs, the overlay color of cells changed abruptly between frames, possibly because, after YOLOv5 had detected a stage, it reported a lower cell count in the following frames and then the correct count again. In some cases the software selected the wrong parent for daughter cells, producing an incorrectly colored overlay. Two TLVs contained direct and reverse cleavages, and the software detected both patterns.

Limitations, reasons for caution: Overall, our software can precisely detect cells, cell divisions and cleavage stages up to the 4-cell stage. We hypothesize that training YOLOv5 on a larger dataset and including information from several focal planes will enable the software to detect overlapping cells and cleavage stages ≥5.
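As a reference for the per-stage precision, recall and F1-score reported under Main results, the sketch below computes these metrics from frame-wise stage labels. It assumes one true and one predicted cell count per frame and a one-vs-rest evaluation for each stage. The abstract does not state how the reported values were aggregated, so this is an illustration rather than the authors' procedure; the function name `per_stage_metrics` and the example labels are hypothetical.

```python
from collections import Counter


def per_stage_metrics(true_counts, predicted_counts):
    """Return {stage: (precision, recall, f1)} from frame-wise labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_counts, predicted_counts):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted stage gains a false positive
            fn[truth] += 1  # the true stage gains a false negative

    metrics = {}
    for stage in sorted(set(true_counts) | set(predicted_counts)):
        n_predicted = tp[stage] + fp[stage]
        n_actual = tp[stage] + fn[stage]
        precision = tp[stage] / n_predicted if n_predicted else 0.0
        recall = tp[stage] / n_actual if n_actual else 0.0
        total = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[stage] = (precision, recall, f1)
    return metrics


# Hypothetical labels for eight frames (not data from the study).
true_stage = [1, 1, 2, 2, 2, 4, 4, 4]
pred_stage = [1, 1, 2, 2, 2, 4, 3, 5]
stage_metrics = per_stage_metrics(true_stage, pred_stage)
```

In this toy example, two of the three 4-cell frames are predicted as 3-cell or 5-cell, which lowers recall for the 4-cell stage while leaving its precision at 1.0. This is the same pattern of high precision with lower recall described for the 4-cell stage above.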
Wider implications of the findings: Object detection proved to be a pragmatic approach for ART. Tracking cell division with our software will reduce the time spent on manual annotation, make abnormal cleavages easier to identify and support more objective assessments. In a qualitative evaluation, embryologists judged the software to be useful and promising for further development.

Trial registration number: Not applicable