Abstract

With the availability of high-density single-nucleotide polymorphism (SNP) data and the development of genotype imputation methods, genomic prediction (GP) based on high-density panels has become possible in livestock breeding. The accuracy of genomic estimated breeding values (GEBV) is generally expected to increase with marker density, yet studies have shown that it does not increase, or even decreases, when high-density panels are used. Measurements of 'marker density' other than SNP number therefore seem to affect GEBV accuracy, and exploring how GEBV accuracy relates to these measurements in high-density SNP or whole-genome sequence data is important for the field of GP. In this study, we constructed SNP panels at fixed SNP numbers (e.g., 1 k) from the high-density SNP data of a German Holstein dairy cattle population, selecting SNPs by the physical distance (PhyD), genetic distance (GenD) or random distance (RanD) between them. Each SNP number level therefore has three different panels. These panels were used to build GP models to predict fat percentage, milk yield and somatic cell score. The mean ($\bar{d}$) and variance ($\sigma_d^2$) of the physical distance between SNPs and the mean ($\overline{r^2}$) and variance ($\sigma_{r^2}^2$) of the genetic distance between SNPs in each panel were used as marker density-related measurements, and their influence on GEBV accuracy was investigated. At the same SNP number level, $\bar{d}$ is essentially the same for all panels, but $\sigma_d^2$, $\overline{r^2}$ and $\sigma_{r^2}^2$ differ, so we investigated only the effects of $\sigma_d^2$, $\overline{r^2}$ and $\sigma_{r^2}^2$ on GEBV accuracy. The results showed that, at a given SNP number level, GEBV accuracy was negatively correlated with $\sigma_d^2$ but not with $\overline{r^2}$ or $\sigma_{r^2}^2$. Panels constructed by PhyD have a smaller $\sigma_d^2$ than those constructed by GenD or RanD.
The low- and moderate-density panels (< 50 k) constructed by RanD or GenD have a large $\sigma_d^2$, which is not conducive to genomic prediction. The GEBV accuracy of the low- and moderate-density panels constructed by PhyD is 3.8% to 34.8% higher than that of the corresponding panels constructed by RanD and GenD. Panels with 20-30 k SNPs constructed by PhyD achieve the same or slightly higher GEBV accuracy than high-density SNP panels for all three traits. In summary, the smaller the variation in physical distance between adjacent SNPs, the higher the GEBV accuracy. Low- and moderate-density panels constructed by physical distance are beneficial to genomic prediction, while pruning high-density SNP data based on genetic distance is detrimental to it. The results provide suggestions for the development of SNP panels and for research on genomic prediction based on whole-genome sequence data.
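To make the spacing measurements concrete, the following is a minimal sketch of how the mean and variance of the physical distance between adjacent SNPs might be computed for a panel on a single chromosome, and how an evenly spaced (PhyD-like) selection compares with a random (RanD-like) one. It is not the authors' implementation: the function names, the nearest-to-target selection rule and the simulated positions are all assumptions made for illustration.

```python
import random
import statistics


def spacing_stats(positions):
    """Return (mean, variance) of the gaps between adjacent SNP positions."""
    pos = sorted(positions)
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    return statistics.mean(gaps), statistics.pvariance(gaps)


def select_evenly_spaced(positions, n):
    """Hypothetical PhyD-like rule: pick the unused SNP nearest to each of
    n evenly spaced target positions along the chromosome."""
    pos = sorted(positions)
    lo, hi = pos[0], pos[-1]
    step = (hi - lo) / (n - 1)
    used = set()
    for k in range(n):
        target = lo + k * step
        best = min((p for p in pos if p not in used),
                   key=lambda p: abs(p - target))
        used.add(best)
    return sorted(used)


def select_random(positions, n, seed=0):
    """Hypothetical RanD-like rule: draw n SNPs uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(sorted(positions), n))


# Simulated base-pair positions on one chromosome (illustrative data only).
rng = random.Random(1)
positions = sorted(rng.sample(range(1, 1_000_001), 5000))

phyd_panel = select_evenly_spaced(positions, 50)
rand_panel = select_random(positions, 50)

phyd_mean, phyd_var = spacing_stats(phyd_panel)
rand_mean, rand_var = spacing_stats(rand_panel)
print(f"PhyD-like: mean gap {phyd_mean:.0f}, variance {phyd_var:.0f}")
print(f"RanD-like: mean gap {rand_mean:.0f}, variance {rand_var:.0f}")
```

In this toy setting both panels contain the same number of SNPs, but the evenly spaced panel has a much smaller gap variance, which is the quantity the abstract reports as negatively correlated with GEBV accuracy.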

Highlights

  • Implementing genomic selection can increase genetic gain, which has been demonstrated in livestock [1,2], plants [3,4] and aquatic species [5,6]

  • In genetic analyses based on high-density panels, markers are usually thinned by a linkage disequilibrium (LD) threshold [9,10], while moderate- or low-density panels are generally constructed from single-nucleotide polymorphisms (SNPs) distributed evenly across the whole genome [11,12]

  • For the panels constructed by the genetic distance (GenD) or random distance (RanD) method, $\sigma_d^2$ was large; this is most obvious in the panels with few SNPs constructed by the GenD method (Figure 2A)

Introduction

Implementing genomic selection can increase genetic gain, which has been demonstrated in livestock [1,2], plants [3,4] and aquatic species [5,6]. Next-generation sequencing efforts have uncovered the genome sequences of many species and revealed thousands of single-nucleotide polymorphism (SNP) markers, making genomic prediction (GP) based on high-density SNP or whole-genome sequence (WGS) data possible [7,8]. In genetic analyses based on high-density panels, markers are usually thinned by a linkage disequilibrium (LD) threshold (genetic distance) [9,10], while moderate- or low-density panels are generally constructed from SNPs distributed evenly across the whole genome [11,12]. The effect of the marker selection method (physical or genetic distance) on genomic prediction performance needs to be studied in detail.
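As an illustration of thinning by an LD threshold, the sketch below computes a genotype-based r² (the squared Pearson correlation between two 0/1/2 dosage vectors, a common proxy for LD) and greedily drops any SNP that is too strongly correlated with a recently kept one. The threshold, the window size, the greedy rule and the toy genotypes are assumptions for illustration; the cited studies may use different software and settings.

```python
import statistics


def r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors."""
    m1, m2 = statistics.fmean(g1), statistics.fmean(g2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as 0 here
    return cov * cov / (v1 * v2)


def ld_thin(genotypes, threshold=0.5, window=10):
    """Greedy thinning (hypothetical rule): walk SNPs in map order and keep a
    SNP only if its r2 with each of the last `window` kept SNPs is below
    `threshold`. Returns the indices of the kept SNPs."""
    kept = []
    for i, g in enumerate(genotypes):
        if all(r2(g, genotypes[j]) < threshold for j in kept[-window:]):
            kept.append(i)
    return kept


# Toy data: one row per SNP, one column per animal (dosages 0/1/2).
genotypes = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],  # identical to SNP 0, so in complete LD with it
    [1, 0, 1, 2, 1, 1],  # uncorrelated with SNP 0
    [2, 1, 0, 1, 2, 0],  # mirror image of SNP 0, also in complete LD
]
kept = ld_thin(genotypes, threshold=0.5)
print(kept)
```

Because this rule looks only at correlation and ignores base-pair positions, the retained SNPs can end up unevenly spaced along the chromosome, which is the property the study contrasts with selection by physical distance.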
