Abstract

Study question
To evaluate the performance of machine learning and deep learning classification models for sperm DNA fragmentation testing.

Summary answer
Machine learning and deep learning models achieved an accuracy above 90% for sperm DNA fragmentation testing.

What is known already
Artificial intelligence (AI) could improve current diagnostics in reproductive medicine. Paternal genome integrity is essential after fertilization, and high levels of sperm DNA damage have been associated with male infertility. Sperm DNA fragmentation (SDF) is a well-established male fertility biomarker. Among the diagnostic tests that analyse SDF, the Comet assay offers high specificity and sensitivity. Analysis by human experts has been shown to be highly precise but can be influenced by external factors, sample variability or fatigue. Consequently, computer-assisted diagnosis based on AI models may help to improve Comet assay analysis.

Study design, size, duration
From February to June 2021, alkaline and neutral Comet assays were performed on semen samples from a heterogeneous group of men. The CometAssay IV™ software was used to record quantitative data from spermatozoa: 500 normal and 500 altered spermatozoa were assessed for each of the alkaline and neutral Comet assays, and an image of each spermatozoon was also taken. In total, 2000 spermatozoa were analysed. Finally, a first validation step was performed using new data from 20000 spermatozoa.

Participants/materials, setting, methods
For each spermatozoon, ten quantitative parameters and a grayscale image were obtained. A machine learning predictive model using the Random Forest (RF) algorithm was trained on the quantitative parameters, and a deep learning Convolutional Neural Network (CNN) was trained on the cell images. Both models were trained on 67% of the data and tested on the remaining 33%.
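The 67%/33% train/test partition described in the methods can be sketched as follows. This is a minimal illustration only: the abstract does not say whether the split was stratified by class or which software performed it, so the stratification, the seed, and the helper names `stratified_split` and `accuracy` are assumptions introduced here.

```python
import random


def stratified_split(labels, train_fraction=0.67, seed=0):
    """Return (train_idx, test_idx), splitting each class separately.

    Stratification is an assumption; the abstract only states 67%/33%.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted class matches the annotation."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# One assay, assuming 500 normal and 500 altered spermatozoa.
labels = ["normal"] * 500 + ["altered"] * 500
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 670 330
```

Under these assumptions each assay contributes 670 cells for training and 330 for testing, with both classes equally represented in each part. The reported accuracies would then correspond to `accuracy` evaluated on the held-out 330 cells.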

Main results and the role of chance
Predictive models based on RF and CNN showed high performance for the automatic classification of normal and altered cells. The RF models achieved an accuracy of 95.51% for the alkaline Comet assay and 92.64% for the neutral Comet assay. The CNN models achieved 96.71% and 93.19%, respectively, and were therefore more accurate on both assays. Among the quantitative parameters considered in the RF model, the most important for classification were Mean_Grey_Level and Total_Intensity in the alkaline Comet assay, and Tail_Migration and Tail_Length in the neutral Comet assay. Finally, 20000 spermatozoa from 100 semen samples were analysed to compare the results of the AI models with the annotations of a human expert. The Kruskal-Wallis test showed no significant differences for either the alkaline or the neutral Comet assay (p > 0.05 in all cases), and paired comparisons using the Mann-Whitney U test likewise showed no significant differences (p > 0.05 in all cases). According to these results, AI models may reproduce a human analysis. To facilitate laboratory use of the models, a web application was developed to process new samples.

Limitations, reasons for caution
Diagnostic assays based on continuous variables rely on a threshold value to separate normal and altered populations. The final result for a sample could be wrong when a large number of its spermatozoa have a fragmentation index near the cut-off value.

Wider implications of the findings
Male infertility can be diagnosed through the analysis of SDF using AI predictive models. This technology might help to standardize SDF testing between laboratories.

Trial registration number
Not applicable
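
The threshold caveat raised under the limitations can be made concrete with a small sketch. It assumes that a higher per-cell value means more fragmentation and that a cell is called altered at or above the cut-off. The cut-off of 30, the margin of 3, the cell values and the function name `altered_fraction_bounds` are all hypothetical and do not come from the study.

```python
def altered_fraction_bounds(cell_values, cutoff, margin):
    """Return (low, point, high) fractions of cells called altered.

    point: cells at or above the cut-off.
    low:   cells that stay altered even if their value drops by `margin`.
    high:  cells that become altered if their value rises by `margin`.
    """
    if not cell_values:
        raise ValueError("cell_values must not be empty")
    if margin < 0:
        raise ValueError("margin must be non-negative")
    n = len(cell_values)
    point = sum(v >= cutoff for v in cell_values)
    low = sum(v >= cutoff + margin for v in cell_values)
    high = sum(v >= cutoff - margin for v in cell_values)
    return low / n, point / n, high / n


# Hypothetical sample: four of eight cells lie within 3 units of the cut-off.
values = [5, 12, 28, 29, 31, 32, 55, 70]
print(altered_fraction_bounds(values, cutoff=30, margin=3))  # (0.25, 0.5, 0.75)
```

In this made-up sample the proportion of altered cells could plausibly range from 25% to 75% depending on how the four borderline cells are called, which is the situation the limitation describes.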