Abstract
Detecting high-order epistasis in genome-wide association studies (GWASs) is important for characterizing complex human diseases. However, the enormous number of possible single-nucleotide polymorphism (SNP) combinations and the diversity among diseases present a significant computational challenge. Herein, a fast method for detecting high-order epistasis based on interaction weight (FDHE-IW) is evaluated for the detection of SNP combinations associated with disease. First, the symmetrical uncertainty (SU) value for each SNP is calculated. Then, the top-k SNPs are selected as guiders to identify 2-way SNP combinations with significant interaction weight values. Next, a forward search is employed to detect high-order SNP combinations with significant interaction weight values as candidates. Finally, the candidates are statistically evaluated using a G-test to isolate true positives. The algorithm was applied to 12 simulated datasets and an age-related macular degeneration (AMD) dataset and was shown to perform robustly in the detection of some high-order disease-causing models.
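As an illustration of the first two steps only, the following minimal sketch ranks SNPs by symmetrical uncertainty against the phenotype and keeps the top k. It assumes the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) with entropies estimated from sample frequencies; the function names (`entropy`, `symmetrical_uncertainty`, `top_k_snps`) and the list-of-lists data layout are hypothetical and not taken from the paper. The interaction weight, forward search, and G-test steps are not shown.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); lies between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is no information to share.
        return 0.0
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy     # I(X; Y)
    return 2.0 * mutual_info / (hx + hy)


def top_k_snps(genotypes, phenotype, k):
    """Return indices of the k SNPs with the highest SU against the phenotype.

    genotypes: one list of genotype codes per SNP (assumed layout).
    phenotype: list of case/control labels, one per sample.
    """
    scores = [symmetrical_uncertainty(snp, phenotype) for snp in genotypes]
    ranked = sorted(range(len(genotypes)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

A call such as `top_k_snps(genotypes, phenotype, k=10)` would return the indices of the candidate guider SNPs under these assumptions.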
Highlights
In recent years, genome-wide association studies (GWASs) have played an important role in identifying single-nucleotide polymorphisms (SNPs) associated with complex human diseases. This approach is non-candidate-driven and investigates the entire genome, making it more comprehensive than gene-specific candidate-driven studies [1].
The joint entropy (JE) is a measure of the uncertainty associated with a set of variables and can be used to measure the genotype distribution of a k-way SNP combination (X1, X2, ..., Xk); however, it cannot be used to assess genotype–phenotype correlations.
The detection power of FDHE-IW was first investigated by comparing it with four state-of-the-art algorithms (BEAM, SNPHarvester, multi-objective ant colony optimization epistasis detection (MACOED), and BOOST) using a disease loci with marginal effects (DME) dataset with 100 SNPs (Figure 2) and a dataset with 1000 SNPs (Figure 3).
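The joint entropy mentioned in the highlights can be sketched as follows. This is a minimal example that assumes the entropy is estimated from observed genotype frequencies; the function name `joint_entropy` and the integer genotype coding are hypothetical. Note that the function takes only genotype vectors as input, which is why JE on its own says nothing about genotype–phenotype correlation.

```python
import math
from collections import Counter


def joint_entropy(*snps):
    """Joint entropy H(X1, ..., Xk) in bits for k genotype vectors.

    Each argument is a sequence of genotype codes (e.g. 0, 1, 2),
    with one entry per sample; all sequences must have equal length.
    """
    combos = list(zip(*snps))  # one k-tuple of genotypes per sample
    n = len(combos)
    counts = Counter(combos)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Usage: joint entropy of a 2-way SNP combination over six samples.
x1 = [0, 1, 2, 0, 1, 2]
x2 = [0, 0, 1, 1, 2, 2]
je = joint_entropy(x1, x2)
```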
Summary
Genome-wide association studies (GWASs) have played an important role in identifying single-nucleotide polymorphisms (SNPs) associated with complex human diseases. It is challenging to develop a method that can reliably distinguish disease-causing SNP combinations from those that are not, given the diversity that exists among disease models [6], especially when there is insufficient sample data. To tackle these challenges, some algorithms were developed to detect synergistic SNP combinations associated with complex diseases. Epistasis multi-objective optimization utilizes exhaustive methods to evaluate all SNP combinations using mutual entropy and a Bayesian network. It is not feasible for high-order epistasis detection because of the enormous computational burden. Non-exhaustive search methods have attracted attention for detecting high-order epistatic interactions because they do not examine all SNP combinations, which reduces the computational burden. These algorithms are often sensitive to parameters and can become trapped in local searches [23,24].