Abstract

Among the various statistical methods for identifying gene–gene interactions in qualitative genome-wide association studies (GWAS), gene-based methods have recently grown in popularity because they confer advantages in both statistical power and biological interpretability. However, most of these methods make strong assumptions about the form of the relationship between traits and single-nucleotide polymorphisms, which result in limited statistical power. In this paper, we propose a gene-based method based on the distance correlation coefficient called gene-based gene-gene interaction via distance correlation coefficient (GBDcor). The distance correlation (dCor) is a measurement of the dependency between two random vectors with arbitrary, and not necessarily equal, dimensions. We used the difference in dCor in case and control datasets as an indicator of gene–gene interaction, which was based on the assumption that the joint distribution of two genes in case subjects and in control subjects should not be significantly different if the two genes do not interact. We designed a permutation-based statistical test to evaluate the difference between dCor in cases and controls for a pair of genes, and we provided the p-value for the statistic to represent the significance of the interaction between the two genes. In experiments with both simulated and real-world data, our method outperformed previous approaches in detecting interactions accurately.
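The idea described above can be illustrated with a minimal sketch. It assumes each gene is given as a NumPy array of SNP genotypes (one row per subject, one column per SNP), uses the standard sample distance correlation, and takes the absolute difference between dCor in cases and dCor in controls as the statistic. The function names (`dcor`, `interaction_pvalue`), the parameter `n_perm`, and the choice to permute case/control labels are illustrative assumptions; the exact statistic and permutation scheme used by GBDcor are not given in this summary.

```python
import numpy as np


def _centered_distances(x):
    """Double-centered Euclidean distance matrix between the rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def dcor(x, y):
    """Sample distance correlation between two samples with equal row counts.

    The column counts of x and y may differ (e.g. genes with different
    numbers of SNPs).
    """
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                      # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom <= 0:                              # a constant sample has no variance
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def interaction_pvalue(gene_a, gene_b, is_case, n_perm=200, seed=0):
    """Permutation p-value for |dCor(cases) - dCor(controls)|.

    Assumption: the null distribution is built by shuffling the
    case/control labels; this is a sketch, not the paper's procedure.
    """
    gene_a = np.asarray(gene_a, dtype=float)
    gene_b = np.asarray(gene_b, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    rng = np.random.default_rng(seed)

    def stat(labels):
        in_cases = dcor(gene_a[labels], gene_b[labels])
        in_controls = dcor(gene_a[~labels], gene_b[~labels])
        return abs(in_cases - in_controls)

    observed = stat(is_case)
    exceed = sum(stat(rng.permutation(is_case)) >= observed
                 for _ in range(n_perm))
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Because dCor is defined for vectors of unequal dimension, the two genes can contain different numbers of SNPs, as long as both arrays list the same subjects in the same order.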

Highlights

  • Genome-wide association studies (GWAS) are a well-established and effective method of identifying genetic loci associated with common diseases or traits, and they have identified over 65,000 unique single-nucleotide polymorphisms (SNPs) that are associated with various traits or diseases [1,2,3,4,5]

  • To assess how well GBDcor detects real gene–gene interactions in a case-control dataset, we investigated a set of gene pairs for susceptibility to rheumatoid arthritis (RA), a chronic autoimmune joint disease in which persistent inflammation affects bone remodeling and results in progressive bone destruction

  • GBDcor is based on distance correlation coefficients and a permutation strategy, and is designed for GWAS on case-control datasets

Summary

Introduction

Genome-wide association studies (GWAS) are a well-established and effective method of identifying genetic loci associated with common diseases or traits, and they have identified over 65,000 unique single-nucleotide polymorphisms (SNPs) that are associated with various traits or diseases [1,2,3,4,5]. GWAS analysis strategies have been based largely on single-locus models, which test the association between each individual marker and a given phenotype independently. Although this type of approach has successfully identified many regions of disease susceptibility, most of the SNPs identified have small effect sizes that fail to account fully for the heritability of complex traits. Other techniques that have been used to study SNP–SNP interactions include multifactor dimensionality reduction (MDR) [20], Tuned ReliefF (TuRF) [21], Bayesian epistasis association mapping (BEAM) [6], tree-based epistasis association mapping (TEAM) [22], Boolean operation-based screening and testing (BOOST) [23], and permutation-based Random Forest (pRF) [24].
