Abstract

Detecting high-order epistasis in genome-wide association studies (GWASs) is important for characterizing complex human diseases. However, the enormous number of possible single-nucleotide polymorphism (SNP) combinations and the diversity among diseases present a significant computational challenge. Herein, a fast method for detecting high-order epistasis based on interaction weight (FDHE-IW) is evaluated for detecting SNP combinations associated with disease. First, the symmetrical uncertainty (SU) value of each SNP is calculated. Then, the top-k SNPs are selected as guiders to identify 2-way SNP combinations with significant interaction weight values. Next, a forward search is employed to detect high-order SNP combinations with significant interaction weight values as candidates. Finally, the candidates are statistically evaluated using a G-test to isolate true positives. The algorithm was applied to 12 simulated datasets and an age-related macular degeneration (AMD) dataset and performed robustly in detecting some high-order disease-causing models.
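
The first two steps of the abstract (scoring each SNP by SU, then keeping the top-k SNPs as guiders) can be illustrated with a minimal sketch. It assumes the standard textbook definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) between a SNP X and the phenotype Y; the exact formulation used by FDHE-IW is not given in this excerpt. The function names, the column-wise data layout, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Textbook SU: 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1].

    Assumption: this is the standard definition, not necessarily the
    exact form used in the paper.
    """
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    return 2.0 * (hx + hy - hxy) / (hx + hy)


def top_k_snps(genotypes, phenotype, k):
    """Return the indices of the k SNP columns with the highest SU.

    `genotypes` is a list of columns, one list of genotype codes per SNP
    (a hypothetical layout chosen for this sketch).
    """
    scores = [symmetrical_uncertainty(col, phenotype) for col in genotypes]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy data (illustrative only): 6 samples, 3 SNPs coded 0/1/2.
phenotype = [0, 0, 0, 1, 1, 1]
snp_a = [0, 0, 0, 2, 2, 2]  # perfectly tracks the phenotype
snp_b = [0, 1, 0, 1, 0, 1]  # weakly related to the phenotype
snp_c = [1, 1, 1, 1, 1, 1]  # constant, carries no information

print(top_k_snps([snp_a, snp_b, snp_c], phenotype, k=2))
```

Ranking by a single-SNP score keeps this step linear in the number of SNPs, which is consistent with the abstract's use of the top-k SNPs as starting points rather than enumerating every pair.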

Highlights

  • In recent years, genome-wide association studies (GWASs) have played an important role in identifying single-nucleotide polymorphisms (SNPs) associated with complex human diseases. This approach is non-candidate-driven and investigates the entire genome, offering a more comprehensive method than gene-specific candidate-driven studies [1].

  • The joint entropy (JE) measures the uncertainty associated with a set of variables and can be used to measure the genotype distribution of a k-way SNP combination (X1, X2, ..., Xk); however, it cannot be used to assess genotype–phenotype correlations.

  • The detection power of FDHE-interaction weight (IW) was first investigated by comparing it with four state-of-the-art algorithms (BEAM, SNPHarvester, multi-objective ant colony optimization epistasis detection (MACOED), and BOOST) using a disease loci with marginal effects (DME) dataset with 100 SNPs (Figure 2) and a dataset with 1000 SNPs (Figure 3).
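
The joint entropy mentioned in the highlights can be sketched as follows, assuming the usual Shannon definition H(X1, ..., Xk) = -Σ p(x1, ..., xk) log2 p(x1, ..., xk) estimated from observed genotype counts. The function name and the column-wise data layout are hypothetical. Note that the function takes only genotype columns and never sees the phenotype, which is why JE alone cannot assess genotype–phenotype correlations.

```python
from collections import Counter
from math import log2


def joint_entropy(snp_columns):
    """Joint Shannon entropy (in bits) of a k-way SNP combination.

    `snp_columns` is a list of k equal-length genotype lists, one per SNP
    (a hypothetical layout for this sketch). Each sample contributes one
    k-tuple of genotypes, and the entropy is taken over those tuples.
    """
    combos = list(zip(*snp_columns))  # one genotype tuple per sample
    n = len(combos)
    return -sum((c / n) * log2(c / n) for c in Counter(combos).values())


# Toy 2-way combination (illustrative only): four samples, and all four
# genotype pairs occur exactly once.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
print(joint_entropy([x1, x2]))
```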


Summary

Introduction

Genome-wide association studies (GWASs) have played an important role in identifying single-nucleotide polymorphisms (SNPs) associated with complex human diseases. It is challenging to develop a method that reliably distinguishes disease-causing SNP combinations from those that are not, given the diversity that exists among disease models [6], especially when sample data are insufficient. To tackle these challenges, several algorithms have been developed to detect synergistic SNP combinations associated with complex diseases. Epistasis multi-objective optimization uses exhaustive methods to evaluate all SNP combinations using mutual entropy and a Bayesian network. This approach is not feasible for high-order epistasis detection because of the enormous computational burden. Methods that do not examine all SNP combinations have attracted attention for detecting high-order epistatic interactions because of their reduced computational burden. However, these algorithms are often sensitive to parameters and can become trapped in local searches [23,24].

Methods
Performance Evaluation
Simulation Data Sets and Case Study
Simulated Models
Experimental Results Using an AMD Dataset
Discussion
Advantage
Limitations
Future Work