Abstract

BackgroundHaplotype analysis has gained increasing attention in the context of association studies of disease genes and drug responsivities over the last years. The potential use of haplotypes has led to the initiation of the HapMap project which is to investigate haplotype patterns in the human genome in different populations. Haplotype inference and frequency estimation are essential components of this endeavour.ResultsWe present a two-stage method to estimate haplotype frequencies in pedigrees, which includes haplotyping stage and estimation stage. In the haplotyping stage, we propose a linear time algorithm to determine all zero-recombinant haplotype configurations for each pedigree. In the estimation stage, we use the expectation-maximization (EM) algorithm to estimate haplotype frequencies based on these haplotype configurations. The experiments demonstrate that our method runs much faster and gives more credible estimates than other popular haplotype analysis software that discards the pedigree information.ConclusionOur method suggests that pedigree information is of great importance in haplotype analysis. It can be used to speedup estimation process, and to improve estimation accuracy as well. The result also demonstrates that the whole haplotype configuration space can be substituted by the space of zero-recombinant haplotype configurations in haplotype frequency estimation, especially when the considered haplotype block is relatively short.

Highlights

  • Haplotype analysis has gained increasing attention in the context of association studies of disease genes and drug responsivities over the last years

  • Haplotyping stage: haplotyping algorithm based on HCLlinkage analysis Excoffier's EM algorithm was widely applied in haplotype analysis [14,15]

  • The second data set is from the CEPH database [17], which contains 65 families; each consists of only three generations, usually with four grandparents, two parents and a number of children

Read more

Summary

Introduction

Haplotype analysis has gained increasing attention in the context of association studies of disease genes and drug responsivities over the last years. The potential use of haplotypes has led to the initiation of the HapMap project which is to investigate haplotype patterns in the human genome in different populations. The modelling of human genetic variation is critical to the understanding of the genetic basis for complex diseases. Single nucleotide polymorphisms (SNPs) are the most frequent form of variation. The Human Genome Project and other large-scale efforts have identified millions of SNP markers. It is useful to analyze haplotypes (haploid genotypes), which are sequences of linked markers on a single chromosome. In diploid organisms, such as human beings, chromosomes come in pairs, and experi-

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call