Abstract

Since the development of next-generation sequencing (NGS) technology, researchers have extended genome-wide association studies (GWAS) from common variants to rare variants in search of the missing heritability. Although various statistical methods have been proposed for analyzing rare-variant data, they generally struggle with complex disease models involving multiple genes. In this paper, we propose a tree-based analysis of rare variants (TARV) that adopts a nonparametric disease model and is capable of exploring gene-gene interactions. We found that TARV outperforms the sequence kernel association test (SKAT) in most of our simulation scenarios, and by notable margins in some cases. By applying TARV to the Study of Addiction: Genetics and Environment (SAGE) data, we detected the gene CTNNA2 and 43 of its specific variants that increase the risk of alcoholism in women, with an odds ratio (OR) of 1.94. This gene had not previously been detected in the SAGE data. A post hoc literature search also supports the role of CTNNA2 as a likely risk gene for alcohol addiction. In addition, we detected a plausible protective gene, CNTNAP2, whose 97 rare variants can reduce the risk of alcoholism in women, with an OR of 0.55. These findings suggest that TARV can be effective in dissecting genetic variants for complex diseases using rare-variant data.

