Abstract

There is currently much debate regarding the best model for how heritability varies across the genome. The authors of GCTA recommend the GCTA-LDMS-I model, the authors of LD Score Regression recommend the Baseline LD model, and we have recommended the LDAK model. Here we provide a statistical framework for assessing heritability models using summary statistics from genome-wide association studies. Based on 31 studies of complex human traits (average sample size 136,000), we show that the Baseline LD model is more realistic than other existing heritability models, but that it can be improved by incorporating features from the LDAK model. Our framework also provides a method for estimating the selection-related parameter α from summary statistics. We find strong evidence (P < 1 × 10⁻⁶) of negative genome-wide selection for traits, including height, systolic blood pressure and college education, and that the impact of selection is stronger inside functional categories, such as coding SNPs and promoter regions.

Highlights

  • Our earlier work[1] compared the GCTA and LDAK models based on the likelihood from restricted maximum likelihood[11] (REML).

  • When software for estimating SNP heritability was first developed, little attention was given to the heritability model, and it was standard to assume that all SNPs are expected to contribute equal heritability.[2,3]

  • Heritability models have previously been compared based on REML likelihood,[1,20] prediction accuracy[4,20] and performance on simulated data.[7,8] All three approaches have shortcomings: the REML likelihood requires individual-level data and cannot be computed for complex heritability models; measuring prediction accuracy requires two independent datasets for each trait, and there is no consensus regarding the best prediction method; and comparisons of heritability models based on simulated data are sensitive to the assumptions of the simulation model.[1]



Introduction

Our earlier work[1] compared the GCTA and LDAK models based on the likelihood from restricted maximum likelihood[11] (REML). This approach requires access to individual-level data and is only feasible for relatively simple heritability models. We propose an approximate model likelihood that can be computed from genome-wide association study (GWAS) summary statistics and for highly complex heritability models.

