Genome-Wide Gene-Based Multi-Trait Analysis.

Yamin Deng, Shaoyu Li, Ruiling Fang, Yuehua Cui, Hongyan Cao, Tao He

doi:10.3389/fgene.2020.00437

Abstract

Genome-wide association studies focusing on a single phenotype have been broadly conducted to identify genetic variants associated with a complex disease. The commonly applied single-variant analysis is limited because it fails to consider the complex interactions between variants, which motivated the development of association analyses focusing on genes or gene sets. Moreover, when multiple correlated phenotypes are available, methods based on a multi-trait analysis can improve the association power. However, most currently available multi-trait analyses are single-variant-based and thus have limited power when disease variants function as a group in a gene or a gene set. In this work, we propose a genome-wide gene-based multi-trait analysis method that treats genes as testing units. For a given phenotype, we adopt a rapid and powerful kernel-based testing method that can evaluate the joint effect of multiple variants within a gene. The joint effect, either linear or nonlinear, is captured through kernel functions. Given a series of candidate kernel functions, we propose an omnibus test strategy to integrate the test results based on different candidate kernels. A p-value combination method is then applied to integrate dependent p-values to assess the association between a gene and multiple correlated phenotypes. Simulation studies show reasonable type I error control and excellent power of the proposed method compared to its counterparts. We further show the utility of the method by applying it to two data sets, the Human Liver Cohort and the Alzheimer Disease Neuroimaging Initiative data set, in which novel genes are identified. Our method has broad applications in other fields in which the interest is to evaluate the joint effect (linear or nonlinear) of a set of variants.
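The three steps described in the abstract (a kernel-based test per phenotype, an omnibus test over candidate kernels, and a combination of dependent p-values across phenotypes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the test statistic, the candidate kernels, or the combination rule, so the sketch assumes a quadratic-form statistic with permutation p-values, a minimum-p omnibus test over linear, quadratic, and identity-by-state kernels, and the Cauchy combination rule as one option that tolerates dependence. All function names are hypothetical.

```python
import numpy as np


def linear_kernel(G):
    # Linear kernel: reflects additive (linear) joint effects of the variants.
    return G @ G.T


def quadratic_kernel(G):
    # Quadratic kernel: also includes pairwise product (interaction) terms.
    return (1.0 + G @ G.T) ** 2


def ibs_kernel(G):
    # Identity-by-state kernel for genotypes coded 0/1/2:
    # average fraction of alleles shared by two subjects across variants.
    p = G.shape[1]
    dist = np.abs(G[:, None, :] - G[None, :, :]).sum(axis=2)
    return 1.0 - dist / (2.0 * p)


def score_stat(y, K):
    # Quadratic-form statistic r' K r, with r the mean-centered phenotype.
    r = y - y.mean()
    return float(r @ K @ r)


def omnibus_pvalue(y, kernels, n_perm=199, seed=None):
    # Assumed omnibus rule: minimum p-value over kernels, calibrated by
    # permuting the phenotype. `seed` may be an int, None, or a Generator.
    rng = np.random.default_rng(seed)
    stats = np.empty((n_perm + 1, len(kernels)))
    stats[0] = [score_stat(y, K) for K in kernels]  # row 0: observed data
    for b in range(1, n_perm + 1):
        y_perm = rng.permutation(y)
        stats[b] = [score_stat(y_perm, K) for K in kernels]
    # Per-kernel permutation p-value for every row (observed and permuted).
    pvals = (stats[None, :, :] >= stats[:, None, :]).mean(axis=1)
    min_p = pvals.min(axis=1)
    # Fraction of rows whose minimum p-value is at least as small as observed.
    return float((min_p <= min_p[0]).mean())


def cauchy_combine(pvals):
    # Cauchy combination of p-values (assumed stand-in for the paper's rule).
    p = np.clip(np.asarray(pvals, dtype=float), 1e-15, 1.0 - 1e-15)
    t = np.mean(np.tan((0.5 - p) * np.pi))
    return float(0.5 - np.arctan(t) / np.pi)


def gene_multitrait_pvalue(Y, G, n_perm=199, seed=None):
    # Y: (n subjects x traits) phenotypes; G: (n subjects x SNPs) genotypes.
    kernels = [linear_kernel(G), quadratic_kernel(G), ibs_kernel(G)]
    rng = np.random.default_rng(seed)
    per_trait = [omnibus_pvalue(Y[:, j], kernels, n_perm, rng)
                 for j in range(Y.shape[1])]
    return cauchy_combine(per_trait), per_trait
```

Calling `gene_multitrait_pvalue(Y, G)` for each gene in turn, with `G` holding that gene's SNP genotypes, would give one combined p-value per gene. Permutation is used here only to keep the sketch short; it is slow at genome-wide scale and is not the rapid test the abstract refers to.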

Highlights

Methods for genome-wide association studies (GWAS) mostly focus on single-variant analysis with a single phenotype, the so-called single-variant single-trait analysis.
We demonstrate the performance of our proposed method through two real-data applications: the Human Liver Cohort (HLC) study and the Alzheimer Disease Neuroimaging Initiative (ADNI) study.
The power decreases as the SNP dimension increases for all three methods; the decrease is more dramatic for RMMLR and multivariate analysis of variance (MANOVA) than for Omnibus Multi-trait Gene-based Association (OMGA).

Summary

Introduction

Methods for genome-wide association studies (GWAS) mostly focus on single-variant (e.g., single nucleotide polymorphism, SNP) analysis with a single phenotype, the so-called single-variant single-trait analysis. Multi-trait analyses gain association power by aggregating multiple weak signals (He et al., 2013; Schifano et al., 2013; Wang, 2014) and lead to a better understanding of disease etiology by detecting genetic variants with pleiotropic effects (Amos and Laing, 1993; Jiang and Zeng, 1995; Schifano et al., 2013). Methods based on summary statistics have gained much popularity recently because individual-level data are typically unavailable (e.g., Kim et al., 2015; Turley et al., 2018). Such methods are largely undermined if the published GWAS summary statistics have limited accuracy. In many complex diseases the marginal effect of a single SNP is usually quite small, and many identified SNPs have limited biological interpretation, for example, SNPs identified in non-coding regions.

Methods

Results

Conclusion