Abstract

Detecting interactions between single nucleotide polymorphisms (SNPs) is crucial for identifying susceptibility genes associated with complex human diseases in genome-wide association studies. Clustering-based approaches are widely used to reduce the search space and to explore potential relationships between SNPs in epistasis analysis. However, these approaches all rely on a single measure to filter out nonsignificant SNP combinations, and a combination discarded under one measure may be significant from another perspective. In this paper, we propose a two-stage approach named EpiMC (Epistatic Interactions detection based on Multiple Clusterings) that employs multiple clusterings to obtain more precise candidate sets and to detect high-order interactions more comprehensively based on these sets. In the first stage, EpiMC introduces a matrix-factorization-based multiple clusterings algorithm to generate multiple diverse clusterings, each of which divides all SNPs into different clusters. This stage reduces the chance of filtering out potential candidates that a single clustering would overlook, and it groups associated SNPs together from different clustering perspectives. In the second stage, EpiMC considers both single-locus effects and interaction effects to select high-quality disease-associated SNPs, and then uses Jaccard similarity to obtain candidate sets. Finally, EpiMC runs an exhaustive search on the resulting small candidate sets to precisely detect epistatic interactions. Extensive simulation experiments show that EpiMC performs better in detecting high-order interactions than state-of-the-art solutions. On the Wellcome Trust Case Control Consortium (WTCCC) dataset, EpiMC detects several significant epistatic interactions associated with breast cancer (BC) and age-related macular degeneration (AMD), which further corroborates the effectiveness of EpiMC.
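
To make the candidate-set step more concrete, the following is a minimal Python sketch of how Jaccard similarity could combine clusters drawn from different clusterings before an exhaustive search. It is an illustration under stated assumptions, not the EpiMC implementation: the abstract does not specify the merging rule, so the function names (`jaccard`, `candidate_sets`, `enumerate_combinations`), the threshold of 0.5, the choice to merge by set union, and the SNP identifiers are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two SNP sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_sets(clusterings, threshold=0.5):
    """Merge clusters from *different* clusterings that overlap strongly.

    Assumption: two clusters are merged (set union) into one candidate set
    when their Jaccard similarity reaches `threshold`. The abstract does not
    state the actual rule used by EpiMC.
    """
    candidates = []
    # Compare every pair of clusterings, then every pair of their clusters.
    for first, second in combinations(clusterings, 2):
        for c1 in first:
            for c2 in second:
                if jaccard(c1, c2) >= threshold:
                    merged = frozenset(c1) | frozenset(c2)
                    if merged not in candidates:
                        candidates.append(merged)
    return candidates


def enumerate_combinations(candidate, order):
    """List every SNP combination of the given order in one candidate set."""
    return list(combinations(sorted(candidate), order))


# Two hypothetical clusterings of five SNPs (identifiers are made up).
clusterings = [
    [{"rs1", "rs2", "rs3"}, {"rs4", "rs5"}],
    [{"rs1", "rs2"}, {"rs3", "rs4", "rs5"}],
]
sets_found = candidate_sets(clusterings)
pairs = enumerate_combinations(sets_found[0], 2)
```

Because each candidate set is small, enumerating all combinations of a given order within it stays cheap, whereas the same enumeration over all SNPs in a genome-wide study would be infeasible.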
