Abstract

Determining the complex relationships between diseases, polymorphisms in human genes and environmental factors is challenging. Multifactor dimensionality reduction (MDR) has been shown to detect statistical patterns of epistasis effectively, but the approach depends on classification accuracy, which imbalanced datasets can seriously degrade. Moreover, MDR methods cannot quantitatively assess the disease risk of genotype combinations. Hence, we introduce a novel weighted risk score-based multifactor dimensionality reduction (WRSMDR) method that uses the Bayesian posterior probability of polymorphism combinations as a new quantitative measure of disease risk. First, we compared WRSMDR with MDR on simulated datasets. Our results showed that WRSMDR had reasonable power to identify high-order gene-gene interactions and was more effective than MDR at detecting four-locus models. Moreover, WRSMDR revealed more information about the effect of genotype combinations on disease risk, and its results were easier to determine and apply than those of MDR. Finally, we applied WRSMDR to a nasopharyngeal carcinoma (NPC) case-control study and identified a statistically significant high-order interaction among three polymorphisms: rs2860580, rs11865086 and rs2305806.

Highlights

  • Complex interactions among genes and environmental factors are known to play a role in common human disease etiology

  • We introduced weighted risk score-based multifactor dimensionality reduction (WRSMDR) as a method for detecting gene-gene interactions in case-control studies

  • Our results showed that the WRSMDR method had reasonable power to identify high-order interactions in simulated datasets


Summary

Introduction

Complex interactions among genes and environmental factors are known to play a role in common human disease etiology. Since its initial description by Ritchie [5], many modifications and extensions to the multifactor dimensionality reduction (MDR) approach have been proposed. These include entropy-based interpretation methods [9], the use of odds ratios [13], log-linear methods [14], generalized linear models [15], methods for imbalanced data [16], permutation testing methods [17,18], methods for addressing missing data [19], parallel implementations [20,21], different evaluation metrics [22,23], methods for quantitative traits [24], balancing function methods [25] and the aggregated-multifactor dimensionality reduction method [26]. We introduce a novel weighted risk score-based multifactor dimensionality reduction (WRSMDR) method for detecting and characterizing high-order gene-gene interactions in case-control studies. The WRSMDR method uses the Bayesian posterior probability of each genotype combination as a quantitative measure of disease risk and uses the proportion of all samples that carry each genotype combination as its weight. We applied the WRSMDR method to identify multiple single-nucleotide polymorphisms (SNPs) associated with nasopharyngeal carcinoma.
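The two quantities named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `genotype_risk_table`, the `prior_case` parameter and the data layout are assumptions made for illustration. The sketch applies Bayes' rule to estimate the posterior probability of disease for each genotype combination and reports the share of all samples carrying that combination as its weight. How WRSMDR combines the two into a final risk score is not described in this excerpt, so the sketch stops at returning both values.

```python
from collections import Counter


def genotype_risk_table(genotypes, labels, prior_case=None):
    """Return {genotype combination: (posterior, weight)}.

    genotypes  : list of tuples, one multi-locus genotype per sample
    labels     : list of 0/1 values (1 = case, 0 = control)
    prior_case : assumed prior probability of disease; defaults to the
                 case fraction in the sample (a hypothetical choice,
                 not taken from the paper)
    """
    n_total = len(labels)
    n_case = sum(labels)
    n_ctrl = n_total - n_case
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one control")
    if prior_case is None:
        prior_case = n_case / n_total

    all_counts = Counter(genotypes)
    case_counts = Counter(g for g, y in zip(genotypes, labels) if y == 1)

    table = {}
    for g, n_g in all_counts.items():
        # Likelihood of the genotype combination within cases and controls.
        lik_case = case_counts[g] / n_case
        lik_ctrl = (n_g - case_counts[g]) / n_ctrl
        # Bayes' rule: P(case | g) = P(g | case) P(case) / P(g).
        evidence = lik_case * prior_case + lik_ctrl * (1.0 - prior_case)
        posterior = lik_case * prior_case / evidence
        # Weight: proportion of all samples carrying this combination.
        weight = n_g / n_total
        table[g] = (posterior, weight)
    return table
```

Because the likelihoods are computed within cases and within controls separately, fixing `prior_case` (for example at 0.5) keeps the posterior from simply mirroring the case-control ratio of an imbalanced sample. With the default empirical prior, the posterior reduces to the fraction of cases among carriers of the combination.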

Results and Discussion
Comparison of WRSMDR with MDR
Application of WRSMDR to NPC Data
The Advantages and Limitations of WRSMDR
WRSMDR
Data Simulation
NPC Data
Data Analysis
Conclusions
