Epistasis helps explain how multiple single-nucleotide polymorphisms (SNPs) interact to cause disease, and a variety of tools have been developed to detect it. In this article, we explore the strengths and weaknesses of an information theory approach for detecting epistasis and compare it with the logistic regression approach through simulations. We consider several scenarios for the involvement of SNPs in an epistasis network, varying the linkage disequilibrium patterns among them and the presence or absence of main and interaction effects. We conclude that the information theory approach detects interaction effects more efficiently when main effects are absent, whereas the logistic regression approach is generally appropriate in all scenarios but produces more false positives. To demonstrate the practical application of the methods, we compute epistasis networks for SNPs in the FSD1L gene using a two-phase head and neck cancer genome-wide association study involving 2,185 cases and 4,507 controls.
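As a rough illustration of what an information-theoretic epistasis score can look like, the sketch below computes the interaction information between two SNPs and case/control status, I(A,B;C) - I(A;C) - I(B;C). The abstract does not state which measure is used, so this choice is an assumption, and the function names (`entropy`, `mutual_information`, `interaction_information`) and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of the columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def interaction_information(snp_a, snp_b, status):
    """Information about status carried by the SNP pair jointly,
    beyond what each SNP carries on its own (assumed measure)."""
    pair = list(zip(snp_a, snp_b))
    return (mutual_information(pair, status)
            - mutual_information(snp_a, status)
            - mutual_information(snp_b, status))


# Toy data (hypothetical): status follows an XOR pattern of the two SNPs,
# so neither SNP has a main effect but the pair is fully informative.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]

main_a = mutual_information(snp_a, status)
main_b = mutual_information(snp_b, status)
score = interaction_information(snp_a, snp_b, status)
print(main_a, main_b, score)
```

In this toy case both single-SNP mutual informations are zero while the interaction score is one bit, which is the "interaction without main effects" situation the abstract refers to. Real genotypes would typically be coded 0/1/2, and assessing significance would need a separate step, such as a permutation test, that is not shown here.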