Although many sequence variants have been discovered in cattle, deciphering the relationship between genome and phenome remains a significant challenge. In this study, we identified functional classes, including mammary-specific genes, lactation-associated genes, novel long non-coding RNAs, miRNAs, RNA editing sites, DNA methylation, histone modifications, and expression quantitative trait loci. We estimated their contributions to the genetic variance of milk production traits using 3 million variants in 23,566 Holstein bulls. Sequence variants in 5'-UTR, synonymous, and splicing regions contributed disproportionately to the genetic variance of milk production traits compared with other genomic regions. Genes specifically expressed in the mammary gland, particularly those active in lactating tissue (e.g., GLYCAM1, DGAT1), accounted for significantly more genetic variance of milk production traits than genes specific to non-mammary tissues. We identified 8,560 differentially expressed genes (DEGs) between lactating and non-lactating tissues. Among these, up-regulated DEGs and down-regulated DEGs with small fold changes showed greater genetic variance enrichment for milk production traits than other genes. Mammary enhancers (marked by, e.g., H3K27ac and H3K4me1) explained more variance than repressive elements, while small changes in DNA methylation level (≤ 0.2) contributed more variance than larger changes (> 0.2). Notably, lactation-associated RNA editing sites in mammary tissue explained more variance for milk production traits than expected by chance. We proposed a novel miRNA prioritization strategy for selecting candidate miRNAs related to milk production traits, based on the overlap between significant enrichment tests of miRNA target correlations and the relatively large variance explained by these targets.
Additionally, we integrated these nine functional classes into the variance component analysis simultaneously, revealing that splicing QTLs (sQTLs), histone modifications, and DEGs showed the highest per-SNP variance enrichment. Finally, we constructed a new 624K SNP panel, which improved the reliabilities of genomic predictions by 0.22%. Dividing routine SNPs into two groups based on functional classes improved the reliabilities by 0.21%, particularly for milk protein percentage (0.68% improvement). Overall, incorporating prior biological knowledge of the mammary gland enhances our understanding of the genetic architecture of milk production and improves the reliability of genomic predictions for milk production traits. This integrative approach establishes a paradigm for translating biological knowledge into agricultural genomics applications.