Abstract

Unexplained genetic variation underlying complex diseases is often attributable to gene-gene interactions (GGIs). Among current statistical methods for discovering GGIs in case-control genome-wide association studies, gene-based methods are both statistically powerful and biologically interpretable. However, most of these approaches make assumptions about the form of GGIs, which results in poor statistical performance. We therefore propose a gene-based test built on the maximal neighborhood coefficient (MNC), called gene-based gene-gene interaction through a maximal neighborhood coefficient (GBMNC). MNC is a metric that captures a wide range of relationships between two random vectors of arbitrary, not necessarily equal, dimensions. Assuming that the joint distribution of two genes should not differ substantially between cases and controls when the genes do not interact, we constructed a statistic that uses the difference in MNC between case and control samples as an indicator of a GGI. We then evaluated this statistic with a permutation-based test to obtain a p-value representing the significance of the interaction. Experiments on both simulated and real data showed that our approach outperformed earlier methods for detecting GGIs.
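The testing scheme described above (a case-versus-control difference in a dependence measure, assessed by permuting phenotype labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the MNC itself is not defined in this summary, so the function `dependence_stand_in` below is a hypothetical placeholder (mean squared cross-correlation between SNP columns) that should be swapped for a real MNC routine. Using the absolute difference as the statistic, and all function and parameter names, are likewise assumptions made for illustration.

```python
import numpy as np


def dependence_stand_in(x, y):
    """Hypothetical placeholder for MNC (NOT the maximal neighborhood coefficient).

    Returns the mean squared Pearson correlation between every column of x
    and every column of y; x and y may have different numbers of columns.
    """
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    sx = xc.std(axis=0)
    sy = yc.std(axis=0)
    # Avoid division by zero for SNP columns with no variation in the subsample.
    sx[sx == 0] = 1.0
    sy[sy == 0] = 1.0
    corr = (xc / sx).T @ (yc / sy) / x.shape[0]
    return float(np.mean(corr ** 2))


def interaction_statistic(gene_a, gene_b, is_case, measure):
    """Absolute difference of the dependence measure between cases and controls.

    Taking the absolute value is an assumption of this sketch.
    """
    in_cases = measure(gene_a[is_case], gene_b[is_case])
    in_controls = measure(gene_a[~is_case], gene_b[~is_case])
    return abs(in_cases - in_controls)


def permutation_p_value(gene_a, gene_b, is_case,
                        measure=dependence_stand_in, n_perm=999, seed=0):
    """Permutation test: shuffle case/control labels, recompute the statistic."""
    rng = np.random.default_rng(seed)
    observed = interaction_statistic(gene_a, gene_b, is_case, measure)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(is_case)  # keeps the case/control counts fixed
        if interaction_statistic(gene_a, gene_b, shuffled, measure) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)
```

Here `gene_a` and `gene_b` are genotype matrices (samples by SNPs) for the two genes, and `is_case` is a boolean NumPy array marking case samples. Shuffling the labels while leaving the genotypes untouched preserves the dependence between the two genes in the pooled sample, so the permutation distribution reflects the hypothesis that this dependence is the same in cases and controls.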

Highlights

  • Genome-wide association studies (GWAS) have been used with great success to investigate the associations between genetic variants and complex disorders

  • We propose a new approach called gene-based gene-gene interaction through a maximal neighborhood coefficient (GBMNC), which uses the maximal neighborhood coefficient (MNC) (Cheng et al, 2020) to identify gene-gene interactions in complex diseases at the gene level in case-control studies

  • We propose a gene-based method for detecting gene-gene interactions (GGIs), called GBMNC, based on a maximal neighborhood coefficient and a permutation strategy for case-control studies in GWAS

Summary

INTRODUCTION

Genome-wide association studies (GWAS) have been used with great success to investigate the associations between genetic variants and complex disorders. The effectiveness of gene-based methods in GWAS marginal association studies should be extended to the study of gene-gene interactions (GGIs) (Emily, 2018; Emily et al, 2020). This strategy offers a number of possible benefits. Gene-based methods are more powerful statistically because a gene carries more information than an individual SNP and because genes interact in a variety of ways (Liu et al, 2010; Li et al, 2011; Jiang et al, 2017; Su et al, 2019; Hu et al, 2020; Hu et al, 2021a; Hu et al, 2021b; Guo et al, 2021). Application of the proposed method to real data sets showed accurate identification of GGIs.

MATERIALS AND METHODS
Maximal Neighborhood Coefficient
Illustration of the GBMNC Workflow
Simulation Study
EXPERIMENTS USING RHEUMATOID ARTHRITIS DATA
Method Heritability
Method Sample size
CONCLUSION
DATA AVAILABILITY STATEMENT