Abstract
Haplotype assembly aims to reconstruct a pair of haplotypes from the SNP values observed in a set of individual DNA fragments. In this paper, we study the minimum error correction (MEC) model for the haplotype assembly problem and explore self-organizing map (SOM) methods for solving it. Specifically, haplotype assembly by MEC is formulated as an integer linear programming model. Since the MEC problem is NP-hard and thus cannot be solved exactly within acceptable running time for large-scale instances, we investigate the ability of classical SOMs to solve the haplotype assembly problem under the MEC model. Then, to overcome the limitations of classical SOMs, we propose a novel SOM approach for the problem. Extensive computational experiments on both synthetic and real datasets show that the new SOM-based algorithm can efficiently reconstruct haplotype pairs with very high accuracy under realistic parameter settings. Comparison with previous methods also confirms the superior performance of the new SOM approach.
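To make the MEC objective concrete, the following minimal sketch evaluates the MEC cost of a candidate haplotype pair: each fragment is assigned to the closer of the two haplotypes, and the mismatches at covered SNP sites are counted as corrections. It is an illustration only, not the integer linear programming formulation or the SOM algorithm of this paper. The encoding (alleles as 0/1, uncovered sites as `None`) and the names `hamming` and `mec_score` are assumptions made for this example.

```python
from typing import Optional, Sequence

# Assumed encoding: one entry per SNP site, 0 or 1 for the observed allele,
# None where the fragment does not cover the site.
Fragment = Sequence[Optional[int]]
Haplotype = Sequence[int]


def hamming(fragment: Fragment, haplotype: Haplotype) -> int:
    """Count mismatches between a fragment and a haplotype over covered sites."""
    return sum(
        1
        for f, h in zip(fragment, haplotype)
        if f is not None and f != h
    )


def mec_score(fragments: Sequence[Fragment], h1: Haplotype, h2: Haplotype) -> int:
    """MEC cost of a haplotype pair: every fragment is charged the number of
    corrections needed to agree with whichever haplotype is closer."""
    return sum(min(hamming(f, h1), hamming(f, h2)) for f in fragments)


# Toy instance (hypothetical data): four SNP sites, three fragments.
h1 = [0, 1, 0, 1]
h2 = [1, 0, 1, 0]
fragments = [
    [0, 1, None, None],  # agrees with h1 on its covered sites
    [None, 0, 1, 0],     # agrees with h2 on its covered sites
    [0, 1, 1, 1],        # one mismatch against h1 (third site)
]
score = mec_score(fragments, h1, h2)
```

On this toy instance only the third fragment needs a correction, so the cost of the pair is 1. Evaluating a given pair is cheap; the hard part of the MEC problem is searching over haplotype pairs for one that minimizes this cost.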