Abstract

Epistasis within disease-related genes (gene–gene interactions) was determined through contingency table measures based on multifactor dimensionality reduction (MDR) using single-nucleotide polymorphisms (SNPs). Most MDR-based methods use a single contingency table measure to detect gene–gene interactions; however, some gene–gene interactions may require identification through multiple contingency table measures. In this study, a multiobjective differential evolution method (called MODEMDR) was proposed to merge the various contingency table measures based on MDR to detect significant gene–gene interactions. Two contingency table measures, namely the correct classification rate and normalized mutual information, were selected to design the fitness functions in MODEMDR. The characteristics of multiobjective optimization enable MODEMDR to use multiple measures to efficiently and synchronously detect significant gene–gene interactions within a reasonable time frame. Epistatic models with and without marginal effects under various parameter settings (heritability and minor allele frequencies) were used to assess existing methods by comparing the detection success rates of gene–gene interactions. The results of the simulation datasets show that MODEMDR is superior to existing methods. Moreover, a large dataset obtained from the Wellcome Trust Case Control Consortium was used to assess MODEMDR. MODEMDR exhibited efficiency in identifying significant gene–gene interactions in genome-wide association studies.
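As a rough illustration of the two measures named above, the sketch below builds an MDR-style 2x2 contingency table for one SNP combination and computes the correct classification rate (CCR) and a normalized mutual information (NMI) score from it. The function names are hypothetical, the high-risk rule (case:control ratio in a genotype cell at or above the overall ratio) is the usual MDR convention rather than something stated here, and the NMI normalization (mutual information divided by the mean of the two marginal entropies) is an assumption; the paper may define it differently.

```python
from collections import Counter
from math import log2


def mdr_contingency(genotypes, labels):
    """Return (tp, fp, fn, tn) for one SNP combination.

    genotypes: one tuple per sample, e.g. (0, 2) for a two-SNP combination.
    labels: 1 for case, 0 for control (at least one control is required).
    """
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    # Assumed threshold: the overall case:control ratio of the dataset.
    threshold = sum(cases.values()) / sum(controls.values())
    tp = fp = fn = tn = 0
    for g in set(cases) | set(controls):
        c, k = cases[g], controls[g]
        if c >= threshold * k:  # cell labelled high risk
            tp += c
            fp += k
        else:  # cell labelled low risk
            fn += c
            tn += k
    return tp, fp, fn, tn


def ccr(tp, fp, fn, tn):
    """Correct classification rate: fraction of samples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def nmi(tp, fp, fn, tn):
    """Mutual information between true and predicted class, divided by the
    mean of the two marginal entropies (an assumed normalization)."""
    n = tp + fp + fn + tn
    joint = [[tp, fn], [fp, tn]]  # rows: case/control, cols: high/low risk
    row = [tp + fn, fp + tn]
    col = [tp + fp, fn + tn]
    mi = 0.0
    for i in range(2):
        for j in range(2):
            if joint[i][j] > 0:
                mi += (joint[i][j] / n) * log2(joint[i][j] * n / (row[i] * col[j]))
    h_row = -sum(r / n * log2(r / n) for r in row if r > 0)
    h_col = -sum(c / n * log2(c / n) for c in col if c > 0)
    denom = (h_row + h_col) / 2
    return mi / denom if denom > 0 else 0.0
```

Both functions score the same table, so a multiobjective search can treat them as two fitness values for one candidate SNP combination instead of collapsing them into a single number.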

Highlights

  • Single-nucleotide polymorphism (SNP) is a genetic variation of DNA sequences within a population

  • The eight epistatic models with marginal effects were used to evaluate the performance of SNPRuler, multifactor dimensionality reduction (MDR), differential evolution MDR (DEMDR) using the correct classification rate (CCR), and MODEMDR

  • To evaluate the ability of MODEMDR to handle large datasets, a large dataset was obtained from the WTCCC [24], consisting of 500,569 SNPs, including 1,988 cases of coronary artery disease (CAD) and 1,500 controls obtained from people living in Great Britain who self-identified as white Europeans


Summary

Introduction

A single-nucleotide polymorphism (SNP) is a genetic variation of DNA sequences within a population. MDR is a nonparametric data mining approach combining a contingency table measure [k-fold cross-validation (CV)] and a dimensionality reduction technique to detect gene–gene interactions in case–control studies. SNPRuler is a nonparametric learning approach based on a predictive rule learning algorithm for identifying gene–gene interactions [11]. These methods have been applied to detect significant gene–gene interactions and investigate the effects of drugs on breast cancer, oral cancer, hypertension, and other human diseases. Pareto optimal solution sets (Pareto sets) represent a powerful technique for collecting good solutions that are not dominated by one another. These nondominated solutions are the output of multiobjective differential evolution (MODE). Developing a method that can synchronously consider multiple measures to detect gene–gene interactions is essential.
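The idea of a Pareto set can be made concrete with a small sketch. It assumes each candidate SNP combination has already been scored on two measures that are both to be maximized (for instance CCR and NMI), and it keeps only the candidates that no other candidate dominates. The function names, candidate labels, and scores are made up for illustration, and the differential evolution search that would generate the candidates is not shown.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b on every measure
    and strictly better on at least one (all measures maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_set(candidates):
    """Keep the candidates that no other candidate dominates.

    candidates: dict mapping a candidate label to its tuple of scores.
    """
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores) for other in candidates.values())
    }


# Hypothetical (CCR, NMI) scores for three SNP pairs.
scored = {
    "snp1-snp2": (0.70, 0.10),
    "snp3-snp4": (0.65, 0.15),
    "snp5-snp6": (0.60, 0.05),
}
front = pareto_set(scored)
```

Here "snp5-snp6" is dropped because "snp1-snp2" is better on both measures, while the other two remain because each wins on one measure. This pairwise filter is quadratic in the number of candidates, which is acceptable for a small population but would need a faster sort for large ones.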

