Abstract
Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in which the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.
Highlights
Genome-wide association studies (GWAS) have been conducted for thousands of human complex traits, revealing that the genetic architecture is almost always highly polygenic; that is, the bulk of the heritable variation is due to thousands of genetic variants, each with a tiny marginal effect (Boyle et al, 2017; Bulik-Sullivan et al, 2015).
We examined how polygenic scores (PGS) for a few example traits port across samples that are of similar genetic ancestry but differ in some common study characteristics, such as the male:female ratio (‘sex ratio’), age distribution, or socio-economic status (SES).
We limited our analysis to the largest subset of individuals in the UK Biobank (UKB) with a relatively homogeneous ancestry: 337,536 unrelated individuals that were characterized by the UKB, based on self-reported ethnicities as well as genetic analysis, as ‘White British’ (WB) (Bycroft et al, 2018).
Summary
Genome-wide association studies (GWAS) have been conducted for thousands of human complex traits, revealing that the genetic architecture is almost always highly polygenic; that is, the bulk of the heritable variation is due to thousands of genetic variants, each with a tiny marginal effect (Boyle et al, 2017; Bulik-Sullivan et al, 2015). These findings make it difficult to interpret the molecular basis for variation in a trait, but they lend themselves more immediately to another use: phenotypic prediction. Polygenic scores (PGS) have been shown to help identify individuals that are more likely to be at risk of diseases such as breast cancer and cardiovascular disease (Khera et al, 2018; Inouye et al, 2018; Mavaddat et al, 2019; Khera et al, 2019). Based on these findings, a number of papers have advocated that PGS be adopted in designing clinical studies, and by clinicians as additional risk factors to consider in treating patients (Torkamani et al, 2018; Khera et al, 2018). Several lines of evidence suggest that adaptation may often take the form of shifts