Abstract

It is widely agreed that complex diseases are typically caused by the joint effects of multiple genetic variations rather than a single one. These genetic variations may show stronger effects when considered together than when considered individually, a phenomenon known as epistasis or multilocus interaction. In this work, we explore the applicability of information interaction to the discovery of pairwise epistatic effects related to complex diseases. We start by showing that traditional approaches, such as classification methods or greedy feature selection methods (for example, the Fleuret method), do not perform well on this problem. We then compare our information interaction method with BEAM and SNPHarvester on artificial datasets simulating epistatic interactions and show that our method has more power to detect pairwise epistatic interactions than its competitors. We then apply the information interaction method to the WTCCC breast cancer dataset and validate the results using permutation tests. We were able to find 89 statistically significant pairwise interactions with a p-value lower than . Even though many recent algorithms have been designed to find epistasis with low marginal effects, we observed that all but one of the SNPs involved in statistically significant interactions have moderate or high marginal effects. We also report that the interactions found in this work are not present in the STRING gene-gene interaction network.
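The permutation-based validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `n_permutations` and `seed` parameters, the entropy-based estimator and its sign convention (positive values read as synergy), and the add-one p-value estimate are all assumptions made for illustration.

```python
import random
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more equal-length discrete columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum(c / n * log2(c / n) for c in counts.values())


def information_interaction(a, b, y):
    """I(A;B;Y) expanded into joint entropies (assumed sign: positive = synergy)."""
    return (entropy(a, b) + entropy(a, y) + entropy(b, y)
            - entropy(a) - entropy(b) - entropy(y) - entropy(a, b, y))


def permutation_p_value(a, b, y, n_permutations=1000, seed=0):
    """Estimate a p-value by shuffling the phenotype labels (hypothetical helper)."""
    rng = random.Random(seed)
    observed = information_interaction(a, b, y)
    shuffled = list(y)  # copy, so the caller's labels are not modified
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if information_interaction(a, b, shuffled) >= observed:
            hits += 1
    # Add-one estimate keeps the p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)
```

Shuffling the phenotype breaks any association with the genotypes while preserving the genotype structure of the SNP pair, so the shuffled scores approximate the null distribution. A genome-wide scan would additionally need a correction for the number of pairs tested, which this sketch does not attempt.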

Highlights

  • The availability of ever more extensive genetic information has spurred intense research on the search for the genetic factors that influence common complex traits

  • We show in our paper that it is possible to apply information interaction to the Wellcome Trust Case Control Consortium (WTCCC) breast cancer dataset without any filtering step

  • We adapted the source code of the Fleuret method to calculate information interaction over all possible pairs of Single Nucleotide Polymorphisms (SNPs). With this approach we benefited from the efficient calculation of conditional mutual information that the code already provided
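To make the quantity in the highlights concrete, here is a small sketch of information interaction computed from conditional mutual information and scored over all SNP pairs. It is a plain-Python illustration, not the adapted Fleuret code: the function names, the dictionary layout of `snps`, and the sign convention I(A;B;Y) = I(A;Y|B) - I(A;Y) (positive values read as synergy) are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more equal-length discrete columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def information_interaction(a, b, y):
    """I(A;B;Y) = I(A;Y|B) - I(A;Y) under the assumed sign convention."""
    return conditional_mutual_information(a, y, b) - mutual_information(a, y)


def rank_pairs(snps, phenotype):
    """Score every SNP pair; `snps` maps a SNP name to its list of genotypes."""
    scores = {
        (i, j): information_interaction(snps[i], snps[j], phenotype)
        for i, j in combinations(sorted(snps), 2)
    }
    # Highest-scoring (most synergistic) pairs first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

For example, with `a = [0, 0, 1, 1]`, `b = [0, 1, 0, 1]` and a phenotype `y = [0, 1, 1, 0]` that is the exclusive-or of the two, neither SNP carries any marginal information about `y`, yet the pair scores one full bit. The exhaustive scan grows quadratically with the number of SNPs, which is why an efficient conditional mutual information routine matters at genome scale.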

Summary

Introduction

The availability of ever more extensive genetic information has spurred intense research into the genetic factors that influence common complex traits. Because of limitations in the data, these analyses are usually performed with single-SNP statistical tests corrected for multiple testing. This approach has severe limitations, because epistatic interactions between SNPs are very important in determining susceptibility to complex diseases. Existing methods for SNP interaction discovery perform poorly when the marginal effects of disease loci are weak or absent: the individual effects of the interacting SNPs may be too small to be detected with the most commonly used statistical methods. There is therefore a need for more powerful methods that can identify interactions between SNPs with low marginal effects.

