Abstract

Background: Sequencing data enable the detection of causal loci or single nucleotide polymorphisms (SNPs) highly linked to causal loci to improve genomic prediction. However, until now, studies on integrating such SNPs using a single-step genomic best linear unbiased prediction (ssGBLUP) model are scarce. We investigated the integration of sequencing SNPs selected by association (1262 SNPs) and bioinformatics (2359 SNPs) analyses into the currently used 54K-SNP chip, using three ssGBLUP models which make different assumptions on the distribution of SNP effects: a basic ssGBLUP model, a so-called featured ssGBLUP (ssFGBLUP) model that considered selected sequencing SNPs as a feature genetic component, and a weighted ssGBLUP (ssWGBLUP) model in which the genomic relationship matrix was weighted by the SNP variances estimated from a Bayesian whole-genome regression model, with every 1, 30, or 100 adjacent SNPs within a chromosome region sharing the same variance. We used data on milk production and female fertility in Danish Jersey. In total, 15,823 genotyped and 528,981 non-genotyped females born between 1990 and 2013 were used as reference population and 7415 genotyped females and 33,040 non-genotyped females born between 2014 and 2016 were used as validation population.

Results: With basic ssGBLUP, integrating SNPs selected from sequencing data improved prediction reliabilities for milk and protein yields, but resulted in limited or no improvement for fat yield and female fertility. Model performances depended on the SNP set used. When using ssWGBLUP with the 54K SNPs, reliabilities for milk and protein yields improved by 0.028 for genotyped animals and by 0.006 for non-genotyped animals compared with ssGBLUP. However, with the SNP set that included SNPs selected from sequencing data, no statistically significant difference in prediction reliability was observed between the three ssGBLUP models.

Conclusions: In summary, when using 54K SNPs, a ssWGBLUP model with a common weight on the SNPs in a given region is a feasible approach for single-trait genetic evaluation. Integrating relevant SNPs selected from sequencing data into the standard SNP chip can improve the reliability of genomic prediction. Based on such SNP data, a basic ssGBLUP model was suggested since no significant improvement was observed from using alternative models such as ssWGBLUP and ssFGBLUP.
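
To make the weighting idea in ssWGBLUP concrete, the following is a minimal sketch of a weighted genomic relationship matrix in which blocks of adjacent SNPs share a common weight. It is not the authors' implementation. Several details are assumptions made for illustration: the common weight is taken as the mean of the SNP variances in each block, the weights are rescaled to an average of 1, the matrix follows a VanRaden-style form G = Z D Z' / Σ 2p(1 − p), all SNPs are assumed to lie on a single chromosome, and the function names `group_weights` and `weighted_grm` are hypothetical. Plain Python is used to keep the sketch dependency-free.

```python
def group_weights(snp_var, group_size):
    """Give every block of `group_size` adjacent SNPs a common weight.

    Assumption: the common weight is the mean of the SNP variances in the
    block. The last block may be shorter than `group_size`.
    """
    weights = []
    for start in range(0, len(snp_var), group_size):
        block = snp_var[start:start + group_size]
        block_mean = sum(block) / len(block)
        weights.extend([block_mean] * len(block))
    return weights


def weighted_grm(genotypes, weights):
    """Weighted genomic relationship matrix G = Z D Z' / sum(2 p (1 - p)).

    genotypes: list of rows (one per animal) of allele counts coded 0/1/2.
    weights:   one non-negative weight per SNP (rescaled here to mean 1).
    Assumption: at least one SNP is polymorphic, so the scale is non-zero.
    """
    n_animals = len(genotypes)
    n_snps = len(weights)

    # Allele frequencies estimated from the genotyped animals themselves.
    freqs = [
        sum(row[j] for row in genotypes) / (2 * n_animals)
        for j in range(n_snps)
    ]

    # Rescale weights so that equal weights reduce to the unweighted matrix.
    mean_w = sum(weights) / n_snps
    d = [w / mean_w for w in weights]

    # Centre genotypes by twice the allele frequency.
    z = [[row[j] - 2 * freqs[j] for j in range(n_snps)] for row in genotypes]

    scale = sum(2 * p * (1 - p) for p in freqs)
    return [
        [
            sum(z[a][j] * d[j] * z[b][j] for j in range(n_snps)) / scale
            for b in range(n_animals)
        ]
        for a in range(n_animals)
    ]


# Toy data: 3 animals, 4 SNPs, made-up SNP variances (illustration only).
genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1]]
snp_var = [0.1, 0.3, 0.2, 0.6]

weights = group_weights(snp_var, group_size=2)
G_weighted = weighted_grm(genotypes, weights)
G_unweighted = weighted_grm(genotypes, [1.0] * len(snp_var))
```

With a group size of 1 each SNP keeps its own variance as its weight, and with a group size equal to the number of SNPs all weights become equal, so the weighted matrix reduces to the unweighted one. In a real evaluation the blocks would be formed within chromosomes, and the resulting matrix would then be combined with the pedigree relationship matrix in the single-step model, which this sketch does not cover.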

Highlights

  • With the BayesN0_WG model, reliabilities increased significantly after integrating all the SNPs selected from sequencing data compared to the use of the 54K-SNP chip (54K) SNP set only, for milk yield (0.045) and protein yield (0.023), but not for fat yield and female fertility traits

  • Similar to the ssWGBLUP model, the ssFGBLUP model, which we introduce for the first time here, was allowed to place more emphasis on the SNPs selected from sequencing data than on the SNPs of the 54K-SNP chip

Introduction

A large number of causal loci or single nucleotide polymorphisms (SNPs) highly linked to causal loci have been discovered from sequencing data through association analyses [1, 2] and bioinformatics analyses [3, 4]. Many of these SNPs are not part of the standard SNP chips that are commonly used in routine genomic evaluation, e.g. the Illumina Bovine SNP50 chip. Su et al. [17] compared different weighting strategies for GBLUP models and suggested the use of the posterior SNP variance from the Bayesian R model [15] as the weighting factor and a common weight shared by a group of 30 adjacent
