Abstract

Background: Polygenic risk scores (PRSs) are a summarization of an individual’s genetic risk for a disease or trait. These scores are being generated in research and commercial settings to study how they may be used to guide healthcare decisions. PRSs should be updated as genetic knowledgebases improve; however, no guidelines exist for their generation or updating.

Methods: Here, we characterize the variability introduced in PRS calculation by a common computational process used in their generation: genotype imputation. We evaluated PRS variability when performing genotype imputation using 3 different pre-phasing tools (Beagle, Eagle, SHAPEIT) and 2 different imputation tools (Beagle, Minimac4), relative to a WGS-based gold standard. Fourteen different PRSs spanning different disease architectures and PRS generation approaches were evaluated.

Results: We find that genotype imputation can introduce variability in calculated PRSs at the individual level without any change to the underlying genetic model. The degree of variability introduced by genotype imputation differs across algorithms, where pre-phasing algorithms with stochastic elements introduce the greatest degree of score variability. In most cases, PRS variability due to imputation is minor (< 5 percentile rank change) and does not influence the interpretation of the score. PRS percentile fluctuations are also reduced in the more informative tails of the PRS distribution. However, in rare instances, PRS instability at the individual level can result in singular PRS calculations that differ substantially from a whole genome sequence-based gold standard score.

Conclusions: Our study highlights some challenges in applying population genetics tools to individual-level genetic analysis including return of results. Rare individual-level variability events are masked by a high degree of overall score reproducibility at the population level. In order to avoid PRS result fluctuations during updates, we suggest that deterministic imputation processes or the average of multiple iterations of stochastic imputation processes be used to generate and deliver PRS results.
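The averaging suggestion above can be illustrated with a minimal sketch. It assumes a PRS is computed as a weighted sum of per-SNP allele dosages, and it stands in for stochastic imputation with Gaussian noise added to synthetic dosages. The function names (`prs`, `percentile_rank`, `averaged_prs`), the noise model, and all data are hypothetical illustrations, not the study's pipeline or the output of Beagle, Eagle, SHAPEIT, or Minimac4.

```python
import numpy as np


def prs(dosages, weights):
    """Weighted sum of allele dosages: one score per individual (row)."""
    return np.asarray(dosages, dtype=float) @ np.asarray(weights, dtype=float)


def percentile_rank(scores, reference):
    """Percentage of reference scores strictly below each score."""
    reference = np.sort(np.asarray(reference, dtype=float))
    below = np.searchsorted(reference, scores, side="left")
    return 100.0 * below / reference.size


def averaged_prs(dosage_runs, weights):
    """Mean score per individual across repeated imputation runs."""
    return np.mean([prs(d, weights) for d in dosage_runs], axis=0)


if __name__ == "__main__":
    # Synthetic data only; the sizes and noise level are arbitrary assumptions.
    rng = np.random.default_rng(0)
    n_individuals, n_snps, n_runs = 200, 50, 5
    weights = rng.normal(0.0, 0.1, n_snps)
    true_dosages = rng.integers(0, 3, size=(n_individuals, n_snps)).astype(float)

    # Assumption: each "stochastic imputation run" is the true dosage matrix
    # plus independent Gaussian noise, clipped to the valid range [0, 2].
    dosage_runs = [
        np.clip(true_dosages + rng.normal(0.0, 0.1, true_dosages.shape), 0.0, 2.0)
        for _ in range(n_runs)
    ]

    gold = prs(true_dosages, weights)          # stand-in for the gold standard
    single = prs(dosage_runs[0], weights)      # one stochastic run
    averaged = averaged_prs(dosage_runs, weights)

    gold_rank = percentile_rank(gold, gold)
    shift_single = np.abs(percentile_rank(single, gold) - gold_rank)
    shift_averaged = np.abs(percentile_rank(averaged, gold) - gold_rank)
    print("max percentile shift, single run:", shift_single.max())
    print("max percentile shift, averaged:  ", shift_averaged.max())
```

In this toy setting, averaging targets only the run-to-run component of the noise; any systematic error shared by every run would remain, which is why a deterministic pipeline is the other option the authors name.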

Highlights

  • Polygenic risk scores (PRSs) are a summarization of an individual’s genetic risk for a disease or trait

  • This algorithm-level variability is observed regardless of the original approach used to derive the PRS and the number of single-nucleotide polymorphisms (SNPs) included in the score: similar variability is observed for three different coronary artery disease (CAD) risk scores derived using very different strategies and including vastly different numbers of SNPs

  • Besides the score variability introduced by computational processes, PRSs evolve over time depending upon the underlying genetic architecture and the size of the genome-wide association studies (GWASs) conducted to date

Summary

Introduction

Polygenic risk scores (PRSs) are a summarization of an individual’s genetic risk for a disease or trait. These scores are being generated in research and commercial settings to study how they may be used to guide healthcare decisions. The PRS update process may introduce fluctuations in any, and possibly all, individual-level scores, whereas updates to monogenic testing results typically involve the rarer clarification of ambiguous results (the reclassification of a variant of unknown significance). This major difference leads to special considerations for the updating of PRSs.
