Abstract

Most positions of the human genome (about 99%) are invariant; only a small fraction (about 1%) commonly vary, and these variant positions are associated with complex genetic diseases. Single nucleotide polymorphisms (SNPs) are the most frequent form of such variation. Haplotype reconstruction divides aligned SNP fragments into two classes and infers a pair of haplotypes from them. Minimum error correction (MEC) is an important model for this problem, but it is effective only when the error rate of the fragments is low. MEC/GI, an extension of MEC, uses the related genotype information in addition to the SNP fragments and therefore yields a more accurate inference. Because the haplotyping problem is NP-hard, an efficient algorithm for an exact solution is unlikely to exist. In this paper, three heuristic clustering methods based on the MEC and MEC/GI models are presented. Numerical results on real biological data and simulated data show that the clustering algorithms work well and increase the similarity between the real haplotypes and the reconstructed ones.
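
To make the MEC objective concrete, the following is a minimal sketch of how a candidate two-class partition of fragments could be scored. It is not the paper's method: the encoding (`0`/`1` alleles, `-` for an uncovered site), the majority-vote consensus, the tie-breaking rule, and the function names `consensus` and `mec_score` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

GAP = "-"  # assumed symbol for a site the fragment does not cover


def hamming(fragment: str, haplotype: str) -> int:
    """Count sites where both strings are called and disagree."""
    return sum(
        1
        for f, h in zip(fragment, haplotype)
        if f != GAP and h != GAP and f != h
    )


def consensus(fragments: Sequence[str], length: int) -> str:
    """Majority-vote haplotype for one class (ties go to '0', an assumption)."""
    out = []
    for j in range(length):
        ones = sum(1 for f in fragments if f[j] == "1")
        zeros = sum(1 for f in fragments if f[j] == "0")
        if ones == 0 and zeros == 0:
            out.append(GAP)  # no fragment in this class covers the site
        else:
            out.append("1" if ones > zeros else "0")
    return "".join(out)


def mec_score(fragments: List[str], assignment: List[int]) -> Tuple[str, str, int]:
    """Return both consensus haplotypes and the number of corrections needed.

    assignment[i] is 0 or 1: the class of fragments[i].
    """
    length = len(fragments[0])
    groups: Tuple[List[str], List[str]] = ([], [])
    for frag, label in zip(fragments, assignment):
        groups[label].append(frag)
    h1 = consensus(groups[0], length)
    h2 = consensus(groups[1], length)
    cost = sum(hamming(f, h1) for f in groups[0]) + sum(
        hamming(f, h2) for f in groups[1]
    )
    return h1, h2, cost


fragments = ["0101", "01-1", "1010", "10-0", "0111"]
assignment = [0, 0, 1, 1, 0]
result = mec_score(fragments, assignment)
print(result)
```

A heuristic clustering method in this setting would search over `assignment` to lower the returned cost; the sketch only evaluates one fixed partition.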

