Abstract

Germline disease-causing variants are generally more spatially clustered in protein 3-dimensional structures than benign variants. Motivated by this tendency, we develop a fast and powerful protein-structure-based scan (PSCAN) approach for evaluating gene-level associations with complex disease and detecting signal variants. We validate PSCAN’s performance on synthetic data and two real data sets for lipid traits and Alzheimer’s disease. Our results demonstrate that PSCAN performs competitively with existing gene-level tests while increasing power and identifying more specific signal variant sets. Furthermore, PSCAN enables generation of hypotheses about the molecular basis for the associations in the context of protein structures and functional domains.

Highlights

  • Many whole exome or whole genome sequencing association studies, such as the National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine Program (NHLBI TOPMed) and the National Human Genome Research Institute Genome Sequencing Program (NHGRI GSP), seek to identify genes and variants that influence human complex diseases and traits [1,2,3]

  • Protein-structure-based scan (PSCAN) accurately detects simulated signal regions: we evaluated the accuracy of the PSCAN procedure for detecting potentially causal variants in simulated disease-associated genes from the scenarios described above

  • We show that the proposed scan tests properly control the type I error rate and achieve substantially higher power than standard burden tests and sequence kernel association tests (SKAT), as well as 1D scan tests based on chromosome location

Introduction

Despite the popularity of these gene-level association tests, their power is limited by the high background rate of neutral variants, even in causal genes. To address this issue, one approach is to consider only variants that are likely to be causal based on their functional annotations. It is common practice to include only loss-of-function variants or variants with high functional effect predictions from algorithms such as PolyPhen2 [15] and SIFT [16]. However, these scores have only modest accuracy, and scores from different tools often disagree [17]. Moreover, most functional annotations are not phenotype-specific, which further limits their effectiveness in filtering neutral variants from association analysis.
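The annotation-based filtering described above can be illustrated with a minimal sketch. It is not the PSCAN method or any tool's actual interface: the `Variant` record, the consequence labels, the score cutoffs (0.85 for PolyPhen2, 0.05 for SIFT), and the helper names are all assumptions chosen for illustration. The sketch keeps loss-of-function variants and variants that either predictor flags as damaging, then collapses the retained variants into a per-individual burden score.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    """Hypothetical variant record; field names are illustrative only."""
    variant_id: str
    consequence: str            # e.g. "missense", "stop_gained", "synonymous"
    polyphen2: Optional[float]  # higher = more likely damaging; None if unscored
    sift: Optional[float]       # lower = more likely deleterious; None if unscored


# Assumed set of loss-of-function consequence labels (illustrative).
LOF_CONSEQUENCES = {"stop_gained", "frameshift", "splice_donor", "splice_acceptor"}


def passes_filter(v: Variant, polyphen2_min: float = 0.85,
                  sift_max: float = 0.05) -> bool:
    """Keep loss-of-function variants and variants flagged by either predictor.

    The cutoffs are assumed defaults, not values taken from the paper.
    """
    if v.consequence in LOF_CONSEQUENCES:
        return True
    damaging_pp2 = v.polyphen2 is not None and v.polyphen2 >= polyphen2_min
    damaging_sift = v.sift is not None and v.sift <= sift_max
    return damaging_pp2 or damaging_sift


def burden_scores(genotypes: List[List[int]], variants: List[Variant]) -> List[int]:
    """Sum minor-allele counts over retained variants for each individual.

    genotypes[i][j] is the allele count (0, 1, or 2) of individual i at variant j.
    """
    keep = [j for j, v in enumerate(variants) if passes_filter(v)]
    return [sum(row[j] for j in keep) for row in genotypes]


# Toy data: four variants in one gene, three individuals.
variants = [
    Variant("v1", "stop_gained", None, None),   # kept: loss of function
    Variant("v2", "missense", 0.95, 0.01),      # kept: both predictors agree
    Variant("v3", "missense", 0.10, 0.40),      # dropped: predicted benign
    Variant("v4", "synonymous", None, None),    # dropped: no functional evidence
]
genotypes = [
    [1, 0, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 1, 1],
]
scores = burden_scores(genotypes, variants)
```

The filter is phenotype-agnostic: a variant is kept or dropped on its annotation alone, regardless of the trait being tested. That is the limitation the paragraph above points out.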
