Incorporating family disease history and controlling case-control imbalance for population-based genetic association studies.

Yongwen Zhuang, Seunggeun Lee, Cristen J Willer, Kisung Nam, Bhramar Mukherjee, Wenjian Bi, Brooke N Wolford, Wei Zhou, Russell Schwartz

doi:10.1093/bioinformatics/btac459

Abstract

Motivation: In genome-wide association analyses of population-based biobanks, most diseases have low prevalence, which results in low detection power. One approach to this problem is to use family disease history, yet existing methods are unable to address the type I error inflation induced by increased correlation of phenotypes among closely related samples, or by unbalanced phenotypic distributions.

Results: We propose a new method for genetic association testing with family disease history, the mixed-model-based Test with Adjusted Phenotype and Empirical saddlepoint approximation (TAPE). It controls for increased phenotype correlation by adopting a two-variance-component mixed model, accounts for case–control imbalance by using empirical saddlepoint approximation, and can flexibly incorporate any existing adjusted phenotype, such as phenotypes from the LT-FH method. We show through simulation studies and through analyses of UK Biobank data of white British samples and Korean Genome and Epidemiology Study data of Korean samples that the proposed method is robust and yields better calibration than existing methods while gaining power for detection of variant–phenotype associations.

Availability and implementation: The summary statistics and code generated in this study are available at https://github.com/styvon/TAPE.

Supplementary information: Supplementary data are available at Bioinformatics online.

Full Text