Abstract

Missing genotypes are a common feature of high-density SNP datasets obtained using SNP chip technology, and they are likely to decrease the accuracy of genomic selection. This problem can be circumvented by imputing the missing genotypes, that is, replacing them with estimated genotypes. When implementing imputation, two choices need to be considered: the criteria used for SNP data quality control, and whether to perform imputation before or after data quality control. In this paper, we compared six strategies of imputation and quality control, which differed in the imputation method, the quality control criteria and the order of imputation and quality control, using a real dataset of milk production traits in Chinese Holstein cattle. The results demonstrated that, regardless of the imputation method and quality control criteria used, strategies with imputation before quality control performed better than strategies with imputation after quality control in terms of accuracy of genomic selection. The different imputation methods and quality control criteria did not significantly influence the accuracy of genomic selection. We concluded that performing imputation before quality control could increase the accuracy of genomic selection, especially when the rate of missing genotypes is high and the reference population is small.

Highlights

  • Genomic selection is becoming prevalent and practicable in dairy cattle breeding, where genomic breeding values of animals are estimated using high density single nucleotide polymorphisms (SNPs) and are the basis for the selection of elite animals [1]

  • The data sets generated using strategies S1, S2, S3 and S4 contained the same number of animals, but different numbers of SNPs

  • Missing genotype information is a common feature of high density SNP datasets which, after data quality control, reduces the number of available SNPs as well as the number of individuals available for estimating SNP effects


Summary

Introduction

Genomic selection is becoming prevalent and practicable in dairy cattle breeding, where genomic breeding values of animals are estimated using high-density single nucleotide polymorphisms (SNPs) and are the basis for the selection of elite animals [1]. Genomic selection combines information on genotypes, phenotypes and pedigree to increase the accuracy of the estimated breeding values (EBVs). Genomic estimated breeding values (GEBVs) are at the core of genomic selection. The routine data quality control procedure in genomic selection eliminates SNPs and animals with low call rates from the data sets, resulting in a loss of information and a decrease in the accuracy of the GEBVs. Imputation can be used to deduce the missing genotypes and could therefore help to increase the accuracy of genomic selection. Imputation also allows for the use of low-density chips that may be more cost-effective, facilitating the widespread implementation of whole-genome selection [5,6].
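To illustrate why the order of imputation and quality control matters, the following is a minimal sketch, not the procedure used in the paper. It assumes genotypes coded 0/1/2 with -1 marking a missing call, a hypothetical call-rate threshold of 0.90 for both SNPs and animals, and a naive most-frequent-genotype fill standing in for a real imputation method. The function names `call_rate_filter` and `impute_most_frequent` are invented for this example, and only call rate is modelled as a quality control criterion.

```python
import numpy as np

MISSING = -1  # assumed code for a missing genotype call


def call_rate_filter(geno, min_snp_call_rate=0.90, min_animal_call_rate=0.90):
    """Drop SNPs (columns), then animals (rows), whose call rate is too low.

    The 0.90 thresholds are illustrative assumptions, not values from the paper.
    """
    keep_snps = (geno != MISSING).mean(axis=0) >= min_snp_call_rate
    geno = geno[:, keep_snps]
    keep_animals = (geno != MISSING).mean(axis=1) >= min_animal_call_rate
    return geno[keep_animals]


def impute_most_frequent(geno):
    """Naive stand-in for imputation: fill each missing call with the
    most frequent observed genotype at that SNP."""
    out = geno.copy()
    for j in range(out.shape[1]):
        observed = out[out[:, j] != MISSING, j]
        if observed.size == 0:
            continue  # no observed calls at this SNP; leave it missing
        fill = np.bincount(observed, minlength=3).argmax()
        out[out[:, j] == MISSING, j] = fill
    return out


# Toy data: 4 animals (rows) x 3 SNPs (columns).
# SNP call rates are 1.00, 0.75 and 0.50.
geno = np.array([
    [0,  1, -1],
    [0, -1,  2],
    [1,  1,  2],
    [0,  1, -1],
])

qc_then_impute = impute_most_frequent(call_rate_filter(geno))
impute_then_qc = call_rate_filter(impute_most_frequent(geno))

print(qc_then_impute.shape, impute_then_qc.shape)  # (4, 1) (4, 3)
```

In this toy case, quality control first discards the two SNPs with low call rates before imputation can fill them in, whereas imputing first leaves all three SNPs available for estimating SNP effects. Whether the retained SNPs actually improve GEBV accuracy depends on how good the imputation is, which this sketch does not assess.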

