Polygenic risk scores (PRSs) are promising tools for advancing precision medicine. However, existing PRS construction methods rely on static summary statistics derived from genome-wide association studies (GWASs), which are often updated at lengthy intervals. As genetic data and health outcomes are continuously being generated at an ever-increasing pace, the current PRS training and deployment paradigm is suboptimal in maximizing the prediction accuracy of PRSs for incoming patients in healthcare settings. Here, we introduce real-time PRS-CS (rtPRS-CS), which enables online, dynamic refinement and calibration of PRS as each new sample is collected, without the need to perform intermediate GWASs. Through extensive simulation studies, we evaluate the performance of rtPRS-CS across various genetic architectures and training sample sizes. Leveraging quantitative traits from the Mass General Brigham Biobank and UK Biobank, we show that rtPRS-CS can integrate massive streaming data to enhance PRS prediction over time. We further apply rtPRS-CS to 22 schizophrenia cohorts in 7 Asian regions, demonstrating the clinical utility of rtPRS-CS in dynamically predicting and stratifying disease risk across diverse genetic ancestries.
Read full abstract