Abstract

Background

Traditional genomic prediction models using multiple regression on single nucleotide polymorphisms (SNPs) genotypes exploit associations between genotypes of quantitative trait loci (QTL) and SNPs, which can be created by historical linkage disequilibrium (LD), recent co-segregation (CS) and pedigree relationships. Results from field data analyses show that prediction accuracy is usually much higher for individuals that are close relatives of the training population than for distantly related individuals. A possible reason is that historical LD between QTL and SNPs is weak and, for close relatives, prediction accuracy of SNP models is mainly contributed by pedigree relationships and CS. Information from pedigree relationships decreases fast over generations and only contributes to within-family prediction. Information from CS is affected by family structures and effective population size, and can have a substantial contribution to prediction accuracy when modeled explicitly.

Results

In this study, a method to explicitly model CS was developed by following the transmission of putative QTL alleles using allele origins at SNPs. Bayesian hierarchical models that combine information from LD and CS (LD-CS model) were developed for genomic prediction in pedigree populations. Contributions of LD and CS information to prediction accuracy across families and generations without retraining were investigated in simulated half-sib datasets and deep pedigrees with different recent effective population sizes, respectively. Results from half-sib datasets showed that when historical LD between QTL and SNPs was low, accuracy of the LD model decreased when the training data size was increased by adding independent sire families, but accuracies from the CS and LD-CS models increased and plateaued rapidly. Results from deep pedigree datasets showed that the LD model had high accuracy across generations only when historical LD between QTL and SNPs was high. Modeling CS explicitly resulted in higher accuracy than the LD model across generations when the mating design generated many close relatives.

Conclusions

Our results suggest that modeling CS explicitly improves accuracy of genomic prediction when historical LD between QTL and SNPs is low. Modeling both LD and CS explicitly is expected to improve accuracy when recent effective population size is small, or when the training data include many independent families.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-016-0255-4) contains supplementary material, which is available to authorized users.

Highlights

  • Traditional genomic prediction models using multiple regression on single nucleotide polymorphisms (SNPs) genotypes exploit associations between genotypes of quantitative trait loci (QTL) and SNPs, which can be created by historical linkage disequilibrium (LD), recent co-segregation (CS) and pedigree relationships

  • Linkage disequilibrium (LD) between quantitative trait loci (QTL) and SNPs was initially thought to be the only source of genetic information that contributes to accuracy of genomic prediction using SNP models, until [8] and [9] showed that co-segregation (CS) of QTL with SNPs and pedigree relationships that are implicitly captured by SNP genotypes contribute to prediction accuracy

  • In analyses of field datasets using SNP models, high accuracy of genomic prediction has been mainly observed among close relatives [1, 2, 15, 16], and prediction accuracy decreases rapidly when the validation individuals are separated from training individuals by more generations [5, 16,17,18]. The latter does not agree with results from simulation studies in which the LD between QTL and SNPs was high [7,8,9, 19]. These results suggest that LD between QTL and SNPs is low in current livestock populations, and that prediction accuracy of the SNP model mainly comes from CS and pedigree relationships that are implicitly captured by SNP genotypes [8, 10, 16, 17, 20]

Summary

Introduction

Traditional genomic prediction models using multiple regression on single nucleotide polymorphisms (SNPs) genotypes exploit associations between genotypes of quantitative trait loci (QTL) and SNPs, which can be created by historical linkage disequilibrium (LD), recent co-segregation (CS) and pedigree relationships. Field data analyses show that prediction accuracy is usually much higher for close relatives of the training population than for distantly related individuals. A possible reason is that historical LD between QTL and SNPs is weak, so that for close relatives the prediction accuracy of SNP models comes mainly from pedigree relationships and CS. LD between QTL and SNPs was initially thought to be the only source of genetic information contributing to the accuracy of genomic prediction using SNP models, until [8] and [9] showed that CS of QTL with SNPs and pedigree relationships that are implicitly captured by SNP genotypes also contribute to prediction accuracy. The average distance between adjacent SNPs on the Illumina Bovine SNP50 BeadChip is 50 kb [13, 14], and the average recombination rate between two adjacent SNPs is only around 0.0005 per meiosis, assuming a typical crossover rate of 1 % per million base pairs.
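The recombination-rate figure quoted above can be checked with a short calculation. The sketch below is only illustrative: it assumes that the recombination rate scales linearly with physical distance (reasonable for short intervals, but ignoring interference and mapping functions), and the function names `recombination_rate` and `prob_no_recombination` are hypothetical helpers, not part of the original study. It also shows how likely it is that two adjacent SNPs are transmitted together without recombination over several meioses, which is the tight linkage that co-segregation relies on.

```python
def recombination_rate(distance_bp: float, crossover_rate_per_mb: float = 0.01) -> float:
    """Approximate recombination rate per meiosis between two loci.

    Assumes a linear relation between physical distance and recombination,
    with a default crossover rate of 1 % per million base pairs.
    """
    return distance_bp / 1_000_000 * crossover_rate_per_mb


def prob_no_recombination(rate: float, meioses: int) -> float:
    """Probability that no recombination occurs between two loci
    in any of `meioses` independent meioses."""
    return (1.0 - rate) ** meioses


# Average spacing of adjacent SNPs quoted in the text: 50 kb.
snp_spacing_bp = 50_000
c = recombination_rate(snp_spacing_bp)
print(f"Recombination rate between adjacent SNPs: {c:.4f} per meiosis")

# Chance that the segment between two adjacent SNPs stays intact.
for t in (1, 10, 100):
    print(f"after {t:>3} meioses: {prob_no_recombination(c, t):.4f}")
```

With 50 kb spacing and 1 % per million base pairs, the product is 0.05 × 0.01 = 0.0005, matching the value given in the text.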
