Abstract

Genome-wide association studies (GWAS) for quantitative traits and disease in humans and other species have shown that there are many loci that contribute to the observed resemblance between relatives. GWAS to date have mostly focussed on discovery of genes or regulatory regions habouring causative polymorphisms, using single SNP analyses and setting stringent type-I error rates. Genome-wide marker data can also be used to predict genetic values and therefore predict phenotypes. Here, we propose a Bayesian method that utilises all marker data simultaneously to predict phenotypes. We apply the method to three traits: coat colour, %CD8 cells, and mean cell haemoglobin, measured in a heterogeneous stock mouse population. We find that a model that contains both additive and dominance effects, estimated from genome-wide marker data, is successful in predicting unobserved phenotypes and is significantly better than a prediction based upon the phenotypes of close relatives. Correlations between predicted and actual phenotypes were in the range of 0.4 to 0.9 when half of the number of families was used to estimate effects and the other half for prediction. Posterior probabilities of SNPs being associated with coat colour were high for regions that are known to contain loci for this trait. The prediction of phenotypes using large samples, high-density SNP data, and appropriate statistical methodology is feasible and can be applied in human medicine, forensics, or artificial selection programs.

Highlights

  • Results from linkage analyses and, more recently, genome-wide association studies (GWAS) imply that a large number of loci underlie the genetic architecture of complex traits [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

  • Analysis is most frequently performed one SNP at a time. These studies may not properly capture all of the genetic variation that is present in the samples, The initial wave of GWAS has found many genetic variants that are robustly associated with disease or quantitative traits, but these variants typically explain only a small fraction of the genetic variance, and so the utility of predictions made using this information can be limited

  • Model selection approaches are required to find the set of SNPs that best explains and predicts variation in phenotype. Such approaches have already been proposed for mapping multiple quantitative trait loci (QTL) [18,19,20,21,22,23] and recently a method was suggested for the simultaneous analysis of all SNPs in a GWAS [24]

Read more

Summary

Introduction

Results from linkage analyses and, more recently, genome-wide association studies (GWAS) imply that a large number of loci underlie the genetic architecture of complex traits [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]. Analysis is most frequently performed one SNP at a time These studies may not properly capture all of the genetic variation that is present in the samples, The initial wave of GWAS has found many genetic variants that are robustly associated with disease or quantitative traits, but these variants typically explain only a small fraction of the genetic variance, and so the utility of predictions made using this information can be limited. Model selection approaches are required to find the set of SNPs that best explains and predicts variation in phenotype Such approaches have already been proposed for mapping multiple quantitative trait loci (QTL) [18,19,20,21,22,23] and recently a method was suggested for the simultaneous analysis of all SNPs in a GWAS [24]. We predict unobserved phenotypes for individuals based on genome-wide SNP data only, family information (without genetic data) only, or on a combination of the two

Methods
Author Summary
Results
Discussion
Nreplicates
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call