The under-representation of admixed populations in both discovery and fine-tuning individual-level datasets limits polygenic risk score (PRS) development and equitable clinical translation for these populations. Under the assumption that the most informative PRS weight for a homogeneous sample varies linearly across a continuous ancestry space, we introduce a Genetic Distance-assisted PRS Combination Pipeline for Diverse Genetic Ancestries (DiscoDivas) to interpolate a harmonized PRS for diverse, especially admixed, ancestries, leveraging genetic distance and multiple PRS weights fine-tuned within single-ancestry samples. DiscoDivas treats ancestry as a continuous variable and does not require switching between models when calculating PRS for different ancestries. We generated PRS with DiscoDivas and with the current conventional method, i.e., fine-tuning multiple GWAS PRS using samples of matched or similar ancestry. DiscoDivas generated a harmonized PRS with accuracy comparable to or higher than that of the conventional approach, with the greatest advantage in admixed individuals.
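To make the interpolation idea concrete, the following is a minimal sketch of combining several single-ancestry PRS for one individual according to genetic distance. It is an illustration under stated assumptions, not the DiscoDivas implementation: it assumes genetic distance is the Euclidean distance between the individual's principal-component coordinates and the centroid of each fine-tuning sample, and it assumes inverse-distance weighting, which may differ from the weighting the pipeline actually uses. The function name `combine_prs` and its parameters are hypothetical.

```python
import math


def combine_prs(sample_pcs, reference_centroids, ancestry_prs):
    """Combine per-ancestry PRS for one individual (illustrative sketch only).

    sample_pcs          -- the individual's coordinates in PC space (assumed input)
    reference_centroids -- PC-space centroid of each single-ancestry fine-tuning sample
    ancestry_prs        -- the individual's PRS under each ancestry-specific model
    """
    weights = []
    for i, centroid in enumerate(reference_centroids):
        # Assumption: genetic distance is Euclidean distance in PC space.
        distance = math.dist(sample_pcs, centroid)
        if distance == 0.0:
            # The individual sits exactly on a centroid: use that model's PRS.
            return ancestry_prs[i]
        # Assumption: closer reference samples get proportionally more weight.
        weights.append(1.0 / distance)

    total = sum(weights)
    return sum(w * s for w, s in zip(weights, ancestry_prs)) / total


# An individual halfway between two reference centroids gets the mean of the two PRS.
midpoint_prs = combine_prs((1.0, 0.0), [(0.0, 0.0), (2.0, 0.0)], [0.2, 0.6])
```

With two reference samples and an individual lying on the line between their centroids, inverse-distance weights reduce to ordinary linear interpolation, which matches the linearity assumption stated above. Because the weights change smoothly with position in PC space, the same function applies to any individual without switching between ancestry-specific models.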