Abstract

The authors present ELB, an easy-to-programme and computationally fast algorithm for inferring gametic phase in population samples of multilocus genotypes. Phase updates are made on the basis of a window of neighbouring loci, and the window size varies according to the local level of linkage disequilibrium. Thus, ELB is particularly well suited to problems involving many loci and/or relatively large genomic regions, including those with variable recombination rate. The authors have simulated population samples of single nucleotide polymorphism genotypes with varying levels of recombination and marker density, and find that ELB provides better local estimation of gametic phase than the PHASE or HTYPER programs, while its global accuracy is broadly similar. The relative improvement in local accuracy increases both with increasing recombination and with increasing marker density. Short tandem repeat (STR, or microsatellite) simulation studies demonstrate ELB's superiority over PHASE both globally and locally. Missing data are handled by ELB; simulations show that phase recovery is virtually unaffected by up to 2 per cent of missing data, but that phase estimation is noticeably impaired beyond this amount. The authors also applied ELB to datasets obtained from random pairings of 42 human X chromosomes typed at 97 diallelic markers in a 200 kb low-recombination region. Once again, they found ELB to have consistently better local accuracy than PHASE or HTYPER, while its global accuracy was close to the best.

Highlights

  • The human genome is highly polymorphic, with more than one heterozygous nucleotide per 500 sites.[1]

  • Over the past few years, it has become increasingly easy to document much of this polymorphism in population samples, using dense maps of single nucleotide polymorphism (SNP) or short tandem repeat (STR, or microsatellite) markers.[2]

  • The Excoffier–Laval–Balding (ELB) algorithm has been introduced for estimating gametic phase from multi-locus genotypes using a window that adapts to local levels of linkage disequilibrium (LD).


Summary

Introduction

The human genome is highly polymorphic, with more than one heterozygous nucleotide per 500 sites.[1] Over the past few years, it has become increasingly easy to document much of this polymorphism in population samples, using dense maps of single nucleotide polymorphism (SNP) or short tandem repeat (STR, or microsatellite) markers.[2] Applications of such data include assessing population structure and migration levels,[3,4] detecting selection and founder effects on disease alleles[5,6] and mapping genes associated with disease.[7] Haplotype data, equivalent to genotype data plus gametic phase, are advantageous for many applications, such as linkage disequilibrium (LD) mapping,[8,9,10] even though the additional information content of haplotype over genotype data is not very large. The advantages of haplotype data arise because they are much more amenable to analysis, in large part because haplotype segments are inherited uniparentally.[11]
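
To make the relationship between genotype data and gametic phase concrete, the following minimal Python sketch enumerates the haplotype pairs that are compatible with one unphased multilocus genotype. It is an illustration only and is not the ELB algorithm: the function name `compatible_haplotype_pairs` and the representation of a genotype as a list of allele pairs are assumptions made for this example. An individual heterozygous at k loci has 2^(k-1) possible phase configurations, and choosing among them is the task a phasing method has to solve.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate the unordered haplotype pairs consistent with an unphased genotype.

    `genotype` is a list of (allele_a, allele_b) tuples, one per locus.
    This representation is a hypothetical choice for illustration.
    """
    # Loci where the two alleles differ are the only ones with ambiguous phase.
    het = [i for i, (a, b) in enumerate(genotype) if a != b]
    if not het:
        # Fully homozygous: both haplotypes are identical and phase is known.
        hap = tuple(a for a, _ in genotype)
        return [(hap, hap)]

    pairs = []
    # Fix the orientation of the first heterozygous locus so that the pair
    # (h1, h2) is not also counted as (h2, h1); flip the others freely.
    for flips in product([False, True], repeat=len(het) - 1):
        flip_at = dict(zip(het[1:], flips))
        h1, h2 = [], []
        for i, (a, b) in enumerate(genotype):
            if flip_at.get(i, False):
                a, b = b, a
            h1.append(a)
            h2.append(b)
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


# Three heterozygous loci out of four give 2**(3 - 1) = 4 configurations.
example = [(0, 1), (0, 0), (0, 1), (0, 1)]
for h1, h2 in compatible_haplotype_pairs(example):
    print(h1, h2)
```

Because the number of configurations doubles with each additional heterozygous locus, exhaustive enumeration of this kind quickly becomes impractical as the number of loci grows.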

