Abstract

Motivation

The two-locus model is a typical and important disease model to identify in genome-wide association studies (GWAS). Because of the heavy computational burden and the diversity of disease models, existing methods suffer from low detection power, high computational cost, and a preference for certain types of disease models.

Method

In this study, two scoring functions (the Bayesian-network-based K2-score and the Gini-score) are used to characterize a pair of SNP loci as a candidate model. The two criteria are applied simultaneously to improve identification power and to counter the preference for particular disease models. The harmony search algorithm (HSA) is improved to quickly find the most likely candidate models among all two-locus models; within it, a local search algorithm with a two-dimensional tabu table avoids repeatedly evaluating disease models that have a strong marginal effect. Finally, the G-test statistic is used to further test the candidate models.

Results

We evaluate our method, named FHSA-SED, on 82 simulated datasets and a real AMD dataset, and compare it with two representative recent methods based on swarm intelligence search algorithms (MACOED and CSE). The simulation results indicate that our method outperforms both compared algorithms in terms of detection power, computation time, number of evaluations, sensitivity (TPR), specificity (SPC), positive predictive value (PPV), and accuracy (ACC). In the AMD dataset, our method identified two SNPs (rs3775652 and rs10511467) that may also be associated with the disease.
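To make the three statistics named in the Method part of the abstract concrete, the following is a minimal sketch of how a single SNP pair could be scored. It is not the authors' implementation. It assumes genotypes coded 0/1/2 and binary case/control labels coded 0/1, and it uses forms of the K2-score (in log space) and the Gini-score that are common in the epistasis-detection literature; the exact definitions used by FHSA-SED may differ. The function names `contingency`, `k2_score`, `gini_score`, and `g_test` are hypothetical.

```python
import math

def contingency(snp_a, snp_b, labels):
    """Count samples per genotype combination and class.

    Assumed coding (not from the source): genotypes 0/1/2, labels 0 = control, 1 = case.
    Returns {(genotype_a, genotype_b): [n_controls, n_cases]}.
    """
    table = {}
    for a, b, y in zip(snp_a, snp_b, labels):
        table.setdefault((a, b), [0, 0])[y] += 1
    return table

def k2_score(table):
    # Log-space K2 form: sum_i [ log((r_i + 1)!) - sum_j log(r_ij!) ].
    # With this sign convention, a lower score means a stronger association.
    return sum(
        math.lgamma(sum(row) + 2) - sum(math.lgamma(c + 1) for c in row)
        for row in table.values()
    )

def gini_score(table):
    # Weighted Gini impurity of the class labels within each genotype combination.
    # Lower means the combinations separate cases from controls more cleanly.
    n = sum(sum(row) for row in table.values())
    return sum(
        (sum(row) / n) * (1.0 - sum((c / sum(row)) ** 2 for c in row))
        for row in table.values()
    )

def g_test(table):
    # G = 2 * sum(O * ln(O / E)) over non-empty cells, E from the row/column margins.
    # Degrees of freedom: (observed combinations - 1) * (classes - 1).
    n = sum(sum(row) for row in table.values())
    col = [sum(row[j] for row in table.values()) for j in range(2)]
    g = 0.0
    for row in table.values():
        r = sum(row)
        for j, observed in enumerate(row):
            if observed > 0:
                g += observed * math.log(observed * n / (r * col[j]))
    return 2.0 * g, len(table) - 1

# Toy data: the genotype combination perfectly predicts the label.
snp_a = [0, 0, 1, 1]
snp_b = [0, 0, 1, 1]
labels = [0, 0, 1, 1]
table = contingency(snp_a, snp_b, labels)
print(k2_score(table), gini_score(table), g_test(table))
```

In a search such as the one described above, the two scores would rank candidate SNP pairs cheaply, and the G-test would then be applied only to the shortlisted pairs; converting G to a p-value requires a chi-square survival function, which is omitted here to keep the sketch dependency-free.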

Highlights

  • With the advent of high-throughput sequencing technology, it has become possible to measure all single-nucleotide polymorphisms (SNPs) in thousands of individuals [1]

  • We evaluate our method, FHSA-SED (built on a fast harmony search algorithm, FHSA), on 82 simulated datasets and a real AMD dataset, and compare it with two representative recent methods based on swarm intelligence search algorithms: MACOED and cuckoo search epistasis (CSE)

  • To assess the FHSA-SED algorithm, we evaluated its performance on 82 simulated datasets covering different types of disease models and compared it with two strong intelligent optimization algorithms (MACOED and CSE)



