Introduction
Polygenic scores (PGSs) summarize the cumulative effect of genetic risk variants associated with complex diseases such as Alzheimer’s disease (AD). The PGS Catalog is a valuable repository of PGSs for many complex diseases, but it lacks standardized annotations and harmonization, which makes the information difficult to integrate for a specific disease.

Methods
In this study, we curated 44 PGS datasets for AD from the PGS Catalog, categorized them into five methodological groups, and annotated 813,257 variants to nearby genes. We aligned the scores based on the “GWAS significant variants” (GWAS-SV) method with the GWAS Catalog, and flagged redundant files as well as files with a “limited scope” due to insufficient external GWAS support. Using rank aggregation (RA), we prioritized consistently important variants and provided an R package, “PgsRankRnnotatR,” to automate this process.

Results
Of the six RA methods evaluated, the “Dowdall” method was the most robust. Our refined dataset, enhanced by multiple RA options, is a valuable resource for AD researchers selecting PGSs or exploring AD-related genetic variants.

Discussion
Our approach offers a framework for curating, harmonizing, and prioritizing PGS datasets, improving their usability for AD research. By integrating multiple RA methods and automating the process, we provide a flexible tool that supports PGS selection and genetic variant exploration. This framework can be extended to other complex diseases or traits, facilitating broader applications in genetic risk assessment.
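To illustrate the rank aggregation step named in the Methods and Results, the following is a minimal Python sketch of Dowdall-style scoring as it is commonly defined: an item at rank r in a list receives 1/r points, and the points are summed across lists. It is an illustration only, not the implementation in “PgsRankRnnotatR,” which may differ in tie handling, normalization, or the treatment of missing variants. The function name `dowdall_aggregate`, the variant labels, and the example rankings are all hypothetical.

```python
from collections import defaultdict


def dowdall_aggregate(rankings):
    """Aggregate several ranked lists with Dowdall (harmonic) scoring.

    rankings: list of lists, each ordered from most to least important.
    An item at rank r in a list earns 1/r points; an item absent from
    a list earns nothing from that list (an assumption of this sketch).
    Returns (item, score) pairs sorted by descending score, with ties
    broken alphabetically by item name.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / position
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical per-PGS variant rankings (placeholder labels, not real data).
pgs_rankings = [
    ["variant_A", "variant_B", "variant_C"],
    ["variant_A", "variant_C"],
    ["variant_B", "variant_A", "variant_C"],
]

consensus = dowdall_aggregate(pgs_rankings)
for variant, score in consensus:
    print(f"{variant}\t{score:.3f}")
```

Because the weights decay harmonically, a variant that ranks near the top of several scores outweighs one that appears often but only at low ranks, which is one plausible reason such a scheme behaves stably when the input lists differ in length.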