Haplotype inference by pure parsimony (Hipp) is a well-known paradigm for haplotype inference. In order to assess the biological significance of this paradigm, we generalize the problem of Hipp to the problem of finding all optimal solutions, which we call Chipp. We study intrinsic haplotype features, such as backbone haplotypes and fat genotypes as well as equal columns and decomposability. We explicitly exploit these features in three computational approaches that are based on integer linear programming, depth-first branch-and-bound, and Boolean satisfiability. Further we introduce two hybrid algorithms that draw upon the diverse strengths of the approaches. Our experimental analysis shows that our optimized algorithms are significantly superior to the baseline algorithms, often with orders of magnitude faster running time. Finally, our experiments provide some useful insights into the intrinsic features of this important problem.
Read full abstract