Accurate genomic selection using low-density SNP panels preselected by maximum likelihood estimation

Shiyu Qu, Sheng Lu, Yang Liu, Ming Li, Songlin Chen

doi:10.1016/j.aquaculture.2023.740154

Abstract

Predicting genomic estimated breeding values (GEBVs) accurately from a low-density SNP panel remains a challenge in genomic selection (GS). Several methods have been proposed for SNP preselection, but they often suffer from either high computational complexity or erratic prediction accuracy. In this study, we developed an approach called MLE-rank, based on maximum likelihood estimation, to preselect a set of SNPs for GS. First, we generated 90 simulated datasets and compared the performance of MLE-rank with uniform distribution and with preselection based on a genome-wide association study (GWAS). For the simulated datasets, compared to uniform distribution, both MLE-rank and GWAS preselection reduced the SNP density by a factor of 10 while maintaining prediction accuracy. In addition, MLE-rank's prediction accuracy was significantly higher than that of the other two methods on the medium- and high-heritability datasets. We then evaluated the three preselection approaches using real disease-resistance phenotypes of leopard coral grouper (Plectropomus leopardus) and Japanese flounder (Paralichthys olivaceus). The 3 k SNPs preselected by MLE-rank gave stable and effective prediction. To achieve similar prediction accuracy, uniform distribution required 70 k SNPs, and GWAS preselection required 3 k (P. leopardus) and 50 k (P. olivaceus). Finally, we evaluated the prediction accuracy of MLE-rank using candidate populations of flounder and their progeny survival rates, with uniform distribution and GWAS preselection as benchmarks. On this dataset, MLE-rank achieved the same predictive performance with low-density SNP panels as with high-density SNPs, regardless of whether GWAS preselection or uniform distribution was used. Taken together, these results indicate that MLE-rank did not reduce prediction accuracy on any of the datasets. MLE-rank showed superior performance in reducing the number of SNPs. Moreover, we observed a relative standard deviation in prediction accuracy when using a low density of SNPs selected by MLE-rank compared to a high density determined through a uniform distribution strategy. In conclusion, MLE-rank not only reduces the number of SNPs used for GS but also exhibits high prediction accuracy. This could lower genotyping costs and promote the wider application of GS in fish breeding.
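
To make the two generic preselection strategies contrasted in the abstract concrete, the sketch below shows one possible reading of each: "uniform distribution" as taking evenly spaced markers from an ordered panel, and rank-based preselection as keeping the k markers with the highest per-SNP score. The function names `uniform_preselect` and `rank_preselect` and the toy scores are hypothetical. The score could be, for example, a GWAS statistic or a likelihood-based statistic; how MLE-rank actually computes its ranking is not described in the abstract and is not reproduced here.

```python
def uniform_preselect(n_snps, k):
    """Return k marker indices spread evenly over an ordered panel of n_snps.

    Assumption: markers are indexed 0..n_snps-1 in genome order.
    """
    if not 0 < k <= n_snps:
        raise ValueError("k must be between 1 and n_snps")
    step = n_snps / k
    # Take the marker nearest the centre of each of the k equal-width bins.
    return [int(step * i + step / 2) for i in range(k)]


def rank_preselect(scores, k):
    """Return the indices of the k highest-scoring markers, in genome order.

    Assumption: scores[i] is any per-SNP statistic where larger means
    more informative (hypothetical input, not the paper's statistic).
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # Rank marker indices by descending score and keep the first k.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy example: a 10-marker panel reduced to 5 markers by each strategy.
toy_scores = [0.1, 5.0, 0.3, 4.2, 2.5, 0.2, 3.9, 0.0, 1.1, 0.4]
uniform_idx = uniform_preselect(len(toy_scores), 5)
ranked_idx = rank_preselect(toy_scores, 5)
```

In this toy case the uniform strategy ignores the scores entirely, while the rank-based strategy may cluster its picks wherever the scores are high, which is the trade-off the comparison in the abstract is probing.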

Full Text