Abstract

The problem of haplotype inference under the Mendelian law of inheritance on pedigree genotype data is studied. The minimum recombination principle states that genetic recombinations are rare and haplotypes with fewer recombinations are more likely to exist. Given genotype data on a pedigree, the problem of Minimum Recombination Haplotype Inference (MRHI) is to find a set of haplotype configurations consistent with the genotype data having the minimum number of recombinations. In this paper, we focus on a variation of the MRHI problem that gives more realistic solutions, namely the k-MRHI problem, which has the additional constraint that the number of recombinations in each parent-offspring pair is at most k. Although the k-MRHI problem is NP-hard even for k = 1, the k-MRHI problem with k > 1 can be solved efficiently by dynamic programming in \(O(n m_0^{3k+1} 2^{m_0})\) time by adopting an approach similar to the one used by Doi, Li and Jiang on pedigrees with n nodes and at most \(m_0\) heterozygous loci in each node. In particular, the 1-MRHI problem can be solved in \(O(n m_0^{4} 2^{m_0})\) time. We propose an \(O(n^2 m_0)\) algorithm to find a node as the root of the pedigree tree so as to further reduce the time complexity to \(O(m_0 \min(t_R))\), where \(t_R\) is the number of feasible haplotype configuration combinations in all trios in the pedigree tree when R is the root. If the pedigree has few generations, the 1-MRHI problem can be solved in \(O(\min\{n m_0^{4} 2^{m_0},\; n m_0^{l+1} 2^{\mu_R + l}\})\) time, where \(\mu_R\) is the number of heterozygous loci in the root, and l is the maximum path length from the root to the leaves in the pedigree tree. Experiments on both real and simulated data confirm that our algorithm outperforms other popular algorithms. In most real cases, our algorithm gives the same haplotyping results but runs much faster.
In some real cases, other algorithms give an answer which has the least number of recombinations, while our algorithm gives a more credible solution and runs faster.

Keywords: Time Complexity, Nuclear Family, Dynamic Programming Algorithm, Heterozygous Locus, Haplotype Inference
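
To make the k-MRHI constraint concrete, the following minimal Python sketch counts the fewest recombinations needed for one transmitted haplotype to be copied from a parent's two haplotypes, and then checks the "at most k" bound for that parent-offspring pair. It is only an illustration under stated assumptions, not the dynamic programming algorithm of the paper: haplotypes are assumed to be non-empty, equal-length strings of allele symbols, a recombination is assumed to be a switch between the two parental strands at adjacent loci, and the names `min_recombinations` and `satisfies_k_bound` are hypothetical.

```python
from math import inf


def min_recombinations(parent_h1, parent_h2, child_hap):
    """Fewest strand switches needed to copy child_hap from the two
    parental haplotypes, or None if no such copy exists.

    Assumption: all three arguments are non-empty strings of equal length.
    """
    strands = (parent_h1, parent_h2)
    # cost[s] = fewest switches so far if the current locus is copied from strand s
    cost = [0 if s[0] == child_hap[0] else inf for s in strands]
    for i in range(1, len(child_hap)):
        new_cost = []
        for idx, s in enumerate(strands):
            if s[i] != child_hap[i]:
                # This strand cannot supply the allele at locus i.
                new_cost.append(inf)
            else:
                # Either stay on the same strand or switch (one recombination).
                new_cost.append(min(cost[idx], cost[1 - idx] + 1))
        cost = new_cost
    best = min(cost)
    return None if best == inf else best


def satisfies_k_bound(parent_h1, parent_h2, child_hap, k):
    """True if the transmitted haplotype needs at most k recombinations."""
    r = min_recombinations(parent_h1, parent_h2, child_hap)
    return r is not None and r <= k
```

For example, with parental haplotypes `"0011"` and `"1100"`, the transmitted haplotype `"0000"` needs one switch, so it passes the bound for k = 1, whereas a haplotype that needs three switches would be rejected.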
