Abstract

Background: Numerous approaches have been proposed for the detection of epistatic interactions within GWAS datasets in order to better understand the drivers of disease and genetics.

Methods: A selection of state-of-the-art approaches was assessed. These included the statistical tests fast-epistasis, BOOST, logistic regression and wtest; the swarm intelligence methods AntEpiSeeker, epiACO and CINOEDV; and the data mining approaches MDR, GSS, SNPRuler and MPI3SNP. Data were simulated to provide randomly generated models with no individual main effects at different heritabilities (pure epistasis), as well as models based on penetrance tables with some main effects (impure epistasis). Detection of both two and three locus interactions was assessed across a total of 1,560 simulated datasets. The different methods were also applied to a section of the UK Biobank cohort for Atrial Fibrillation.

Results: For pure two locus interactions, PLINK's implementation of BOOST recovered the highest number of correct interactions, at 53.9%, and performed significantly better than the other methods (p = 4.52e-36). For impure two locus interactions, MDR exhibited the best performance, recovering 62.2% of the most significant impure epistatic interactions (p = 6.31e-90 for all but one test). The assessment of three locus interaction prediction revealed that wtest recovered the highest number (17.2%) of pure epistatic interactions (p = 8.49e-14). wtest also recovered the highest number of three locus impure epistatic interactions (p = 6.76e-48), while AntEpiSeeker ranked the highest number of such interactions (40.5%) as the most significant. Finally, the methods were applied to a real dataset for Atrial Fibrillation, most notably finding an interaction between SYNE2 and DTNB.
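The distinction between pure and impure epistasis drawn in the Methods can be illustrated with a small two locus penetrance table. The sketch below is not the simulation procedure used in the study. It assumes Hardy-Weinberg equilibrium and independent loci, and the table values, minor allele frequencies and function names (`hwe_genotype_freqs`, `marginal_penetrances`, `is_pure_epistasis`) are hypothetical choices made for illustration. A model is treated as pure when the marginal penetrance of each locus is identical across its three genotypes even though the joint table varies.

```python
from typing import List, Sequence, Tuple

Table = Sequence[Sequence[float]]  # 3x3: rows = locus A genotype, cols = locus B genotype


def hwe_genotype_freqs(maf: float) -> Tuple[float, float, float]:
    """Frequencies of genotypes carrying 0, 1, 2 minor alleles under Hardy-Weinberg."""
    q = maf
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)


def marginal_penetrances(
    table: Table, maf_a: float, maf_b: float
) -> Tuple[List[float], List[float]]:
    """Disease probability per genotype of each locus, averaged over the other locus."""
    freqs_a = hwe_genotype_freqs(maf_a)
    freqs_b = hwe_genotype_freqs(maf_b)
    marg_a = [sum(freqs_b[j] * table[i][j] for j in range(3)) for i in range(3)]
    marg_b = [sum(freqs_a[i] * table[i][j] for i in range(3)) for j in range(3)]
    return marg_a, marg_b


def is_pure_epistasis(
    table: Table, maf_a: float, maf_b: float, tol: float = 1e-9
) -> bool:
    """True if neither locus has a main effect but the joint table is not constant."""
    marg_a, marg_b = marginal_penetrances(table, maf_a, maf_b)
    no_main_effect = all(
        max(m) - min(m) <= tol for m in (marg_a, marg_b)
    )
    cells = [cell for row in table for cell in row]
    joint_effect = max(cells) - min(cells) > tol
    return no_main_effect and joint_effect


# Illustrative tables (assumed values, not taken from the study).
PURE_TABLE = [
    [0.0, 0.2, 0.0],
    [0.2, 0.0, 0.2],
    [0.0, 0.2, 0.0],
]
# Raising one cell gives both loci a marginal (main) effect: impure epistasis.
IMPURE_TABLE = [
    [0.0, 0.2, 0.0],
    [0.2, 0.0, 0.2],
    [0.0, 0.2, 0.4],
]

if __name__ == "__main__":
    print(marginal_penetrances(PURE_TABLE, 0.5, 0.5))
    print(is_pure_epistasis(PURE_TABLE, 0.5, 0.5))
    print(is_pure_epistasis(IMPURE_TABLE, 0.5, 0.5))
```

In this sketch, purity depends on the allele frequencies as well as on the table: the same `PURE_TABLE` evaluated at a minor allele frequency other than 0.5 acquires marginal effects and is no longer pure. Single locus tests have no signal to detect in a pure model, which is why methods are assessed separately on pure and impure simulations.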

Highlights

  • To gain a better understanding of the underlying mechanisms that govern disease pathophysiology and pathobiology, genetic studies have been carried out at increasing volume and across large populations [1, 2]

  • Numerous approaches have been proposed for the detection of epistatic interactions within Genome-Wide Association Studies (GWAS) datasets in order to better understand the drivers of disease and genetics

  • The assessment of three locus interaction prediction revealed that wtest recovered the highest number (17.2%) of pure epistatic interactions (p = 8.49e-14)

Summary

Introduction

To gain a better understanding of the underlying mechanisms that govern disease pathophysiology and pathobiology, genetic studies have been carried out at increasing volume and across large populations [1, 2]. The initial hopes that Genome-Wide Association Studies (GWAS) would contribute to major breakthroughs in our understanding of disease mechanisms failed to materialize. This was due to our inability to reliably identify genetic drivers, commonly referred to as the missing heritability challenge. Similar approaches have been successful in recapturing much higher heritability estimates in twin studies [6]. Possible explanations for this shortfall have centred on three general domains: aetiology being driven by a large number of genes and variants, many of which are not captured or not deemed significant; causative substitutions, indels or structural variations not being identified by the study design; and the effects of epistatic genetic interactions not being identified [7, 8]. Numerous approaches have been proposed for the detection of epistatic interactions within GWAS datasets in order to better understand the drivers of disease and genetics.
