Background: Inherited hypercholesterolemia is an important Mendelian condition associated with early-onset myocardial infarction. However, current genetic diagnostic criteria only partially explain severe clinical hypercholesterolemia. Furthermore, current pathogenicity categorizations do not consistently inform prognosis, and quantification is limited by the scarcity of such rare variants. Aim: Exploiting population-scale genotype/phenotype data, we aimed to determine the architecture of the genetic variants associated with blood lipid levels through quantitative assessment. Methods: We discovered and characterized rare coding alleles contributing to genetic dyslipidemia, a principal risk factor for coronary artery disease, among over a million individuals by combining two large contemporary genetic datasets (Million Veteran Program, N = 626,554, and UK Biobank, N = 431,026). Results: Testing 2,869,269 rare (minor allele frequency < 1%) coding variants, we identified 793 exome-wide significant associations (P < 4.4 × 10⁻¹¹), including 8 previously undiscovered loci. Associated alleles were highly enriched in functional variant classes (high-confidence predicted loss-of-function, deleterious missense, and cryptic splicing variants), showed significant and recessive associations, and had similar effects across populations. Overall, rare variants contributed an additional > 16% of genetic trait variance beyond that explained by common genetic variants. The effect estimates of associated signals confirmed and refined curated pathogenic variant databases, as highlighted by the identification of African- or South Asian-specific pathogenic alleles whose pathogenicity was previously controversial. Conclusion: Through a large-scale sequencing study, this work provides resources and insights for understanding causal mechanisms and quantifying the expressivity of rare coding alleles for blood lipids.
Furthermore, by identifying and characterizing population-enriched variants, our study underscores the importance of multi-population rare variant studies.