Abstract

In biobank-scale genome-wide association studies (GWAS) covering thousands of phenotypes, most binary phenotypes have substantially fewer cases than controls. Many widely used approaches for the joint analysis of multiple phenotypes produce inflated type I error rates for such extremely unbalanced case-control phenotypes. To circumvent this issue, we develop two novel methods for jointly analyzing multiple unbalanced case-control phenotypes. In the first method, we group the phenotypes into clusters using hierarchical clustering and then merge the phenotypes in each cluster into a single phenotype. Within each cluster, we use the saddlepoint approximation to estimate the p-value of an association test between the merged phenotype and a SNP, which eliminates the inflated type I error rate of the test for extremely unbalanced case-control phenotypes. Finally, we use the Cauchy combination method to combine the cluster-level p-values into an integrated p-value for testing the association between the multiple phenotypes and the SNP. In the second method, we first construct a Multi-Layer Network (MLN) from all individuals who have case status for at least one phenotype. We then introduce a computationally efficient community detection method that groups the phenotypes into disjoint clusters based on the MLN. The phenotypes in the same cluster are merged into a single phenotype, which largely eliminates the inflated type I error rate of tests for extremely unbalanced binary phenotypes. Finally, to test the association between all phenotypes and a SNP, we apply a score test to each merged phenotype and the SNP and then use the Omnibus test to obtain an overall p-value (MLN-O). Extensive simulation studies show that the proposed approaches control type I error rates and are more powerful than the methods we compared them with. Real data analyses also show that our methods outperform the methods we compared them with.
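
The Cauchy combination step of the first method can be illustrated with a minimal sketch. It assumes the standard form of the Cauchy combination statistic, in which each p-value is mapped to a Cauchy variate by tan((0.5 - p)π) and the weighted sum is converted back to a p-value. The function name `cauchy_combination`, the equal default weights, and the small-p cutoff of 1e-15 are illustrative assumptions, not details given in the abstract.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine per-cluster p-values into one p-value (illustrative sketch).

    Assumptions (not from the abstract): equal weights by default, and a
    first-order approximation for p-values below 1e-15.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * k
    if len(weights) != k or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative, one per p-value")

    total = sum(weights)
    stat = 0.0
    for w, p in zip(weights, p_values):
        if p < 1e-15:
            # tan((0.5 - p) * pi) = cot(p * pi), which is about 1 / (p * pi)
            # for tiny p; this avoids losing p to floating-point rounding.
            term = 1.0 / (p * math.pi)
        else:
            term = math.tan((0.5 - p) * math.pi)
        stat += (w / total) * term

    if stat > 1e15:
        # Upper tail of the standard Cauchy distribution for a huge statistic.
        return 1.0 / (stat * math.pi)
    # Survival function of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # Hypothetical p-values for three phenotype clusters.
    print(cauchy_combination([1e-8, 0.5, 0.9]))
```

With a single p-value the function returns that p-value unchanged, and one very small cluster-level p-value dominates the combined result, which is the behavior that makes this combination rule attractive for sparse signals.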

