Abstract

Despite rapid advances in genomic technology, our ability to account for phenotypic variation using genetic information remains limited for many traits. This has unfortunately resulted in limited application of genetic data towards preventive and personalized medicine, one of the primary impetuses of genome-wide association studies. Recently, a large proportion of the “missing heritability” for human height was statistically explained by modeling thousands of single nucleotide polymorphisms concurrently. However, it is currently unclear how gains in explained genetic variance will translate to the prediction of yet-to-be observed phenotypes. Using data from the Framingham Heart Study, we explore the genomic prediction of human height in training and validation samples while varying the statistical approach used, the number of SNPs included in the model, the validation scheme, and the number of subjects used to train the model. In our training datasets, we are able to explain a large proportion of the variation in height (h2 up to 0.83, R2 up to 0.96). However, the proportion of variance accounted for in validation samples is much smaller (ranging from 0.15 to 0.36 depending on the degree of familial information used in the training dataset). While such R2 values vastly exceed what has been previously reported using a reduced number of pre-selected markers (<0.10), given the heritability of the trait (∼0.80), substantial room for improvement remains.

Highlights

  • Few examples exist of findings from Genome Wide Association Studies (GWAS) being applied to preventive and personalized medicine

  • Predictive models from GWAS are constructed using a small number of Single Nucleotide Polymorphisms (SNPs) that have been pre-selected using extremely low p-values derived from single-marker regressions

  • Our results are concordant with the Yang Study, demonstrating that much of the variance in human height can be accounted for using Whole Genome Prediction (WGP) methods based on common SNPs

Read more

Summary

Introduction

Few examples exist of findings from Genome Wide Association Studies (GWAS) being applied to preventive and personalized medicine. Despite the success of GWAS in the discovery of many novel disease variants, the variants identified as being statistically significant typically account for minimal fractions of the genetic variance, even for highly heritable traits [1] This so-called ‘‘missing heritability’’ has prompted a wide array of explanations, ranging from poor modeling (e.g., unaccounted epistatic effects) [2,3], insufficient sample sizes [4], sparse genetic coverage [5], rare variants [6], undetected CNV effects [7], and over-estimated heritability [1,8,9]. Drawing on methods commonly used in animal breeding [17], the Yang Study built a model for human height (a model trait that has recently received much attention because of its high heritability and relatively reliable phenotyping) with hundreds of thousands of SNPs jointly considered (see Visscher et al [18] for an expanded commentary on the methodology employed)

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call