Abstract

Study question: Can the expression of the genes AURKA, HDAC4, and CARHSP1 be used to assess sperm quality when investigating male infertility?

Summary answer: The combined expression of AURKA, HDAC4, and CARHSP1 can be used to classify sperm and predict their functionality.

What is known already: The diagnosis of male infertility is largely based on semen parameters interpreted against the World Health Organization (WHO) reference values. While these parameters describe basic semen characteristics, they do not provide insight into sperm function and have limited predictive value for natural fecundity, fertilization rates, and assisted reproductive technology (ART) outcomes.

Study design, size, duration: This prospective study investigates the expression of genes involved in mitosis, epigenetic regulation, and early embryo development. From February to June 2023, we included all men aged 20 to 60 years undergoing semen analysis at our center. A total of 277 semen samples were collected, of which 129 (46.4%) had all three sperm parameters normal according to WHO criteria. Of the remaining samples, 32 showed oligo-astheno-teratozoospermia (OAT) and 116 had one or two abnormal parameters.

Participants/materials, setting, methods: The mRNA expression levels of three candidate genes (AURKA, HDAC4, and CARHSP1) were quantified in fresh ejaculated semen using real-time quantitative reverse-transcription polymerase chain reaction (qRT-PCR). The thresholds for the spermatozoa function index (SFI) were established by assessing the combined expression of these target genes, taking into account sperm morphology and the capability to reach the blastocyst stage. Samples were ranked according to concentration (million/ml), motility (%), and morphology (%).

Main results and the role of chance: We assess the quality of spermatozoa using three biomarkers by analyzing their mRNA expression levels.
Receiver operating characteristic (ROC) curves were used to define a combined expression index value. This model establishes an index threshold > 320 for an “over-index value,” distinguishing spermatozoa with a normal capacity, an index threshold < 290 for an “under-index value,” and an index between 290 and 320 for an “intermediate-index value”. Among the 277 samples, 112 (40.4%) have an over-index value, 161 (58.1%) have an under-index value, and 4 (1.4%) have an intermediate-index value. Interestingly, among the 129 samples with three normal WHO parameters, 82 (63.5%) have an over-index value, 44 (34.1%) have an under-index value, and 3 (2.3%) display an intermediate-index value. In addition, all 32 OAT samples have an under-index value. These findings suggest that the OAT phenotype may involve an epigenetic network connected with our three target genes. Furthermore, even samples with normal sperm parameters may have an under-index value, indicating lower functionality in terms of fertilization and early embryo development.

Limitations, reasons for caution: The present method was validated only on ejaculated semen from men free of any chronic disease and not receiving any antioxidant treatment. This prospective study requires further validation in a larger population.

Wider implications of the findings: The results of this study could improve the understanding of chaotic early embryo development in ART and could be used to explore male infertility, particularly unexplained infertility.

Trial registration number: Not applicable
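
The index thresholds reported under the main results can be illustrated with a minimal Python sketch. It assumes an index value has already been computed for each sample; the abstract does not give the formula that combines the three expression levels, so none is attempted here. The function names `classify_sfi` and `summarize`, the short category labels, and the input values are hypothetical. Because the abstract only states strict inequalities, the handling of values exactly equal to 290 or 320 (treated here as intermediate) is also an assumption.

```python
from collections import Counter

OVER_THRESHOLD = 320   # index > 320: "over-index value" (from the abstract)
UNDER_THRESHOLD = 290  # index < 290: "under-index value" (from the abstract)


def classify_sfi(index: float) -> str:
    """Assign an index value to one of the three categories in the abstract.

    Assumption: values exactly equal to 290 or 320 are counted as
    intermediate; the abstract does not specify the boundary cases.
    """
    if index > OVER_THRESHOLD:
        return "over"
    if index < UNDER_THRESHOLD:
        return "under"
    return "intermediate"


def summarize(indices):
    """Return {category: (count, percentage)} for a list of index values."""
    counts = Counter(classify_sfi(value) for value in indices)
    total = len(indices)
    summary = {}
    for category in ("over", "under", "intermediate"):
        n = counts[category]
        pct = round(100 * n / total, 1) if total else 0.0
        summary[category] = (n, pct)
    return summary


if __name__ == "__main__":
    # Hypothetical index values, for illustration only (not study data).
    example = [350.0, 280.0, 300.0, 400.0]
    for category, (n, pct) in summarize(example).items():
        print(f"{category}: {n} ({pct}%)")
```

Applied to the full set of index values, a tally of this kind would produce counts in the same form as those reported above (number and percentage per category).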