Abstract

One limitation of existing tagging SNP selection algorithms is that they assume the reported genotypes are error free. In practice, however, genotyping errors are often unavoidable. Many tagging SNP selection methods depend heavily on estimated haplotype frequencies, and recent studies have demonstrated that even slight genotyping errors can seriously distort haplotype reconstruction and frequency estimation. Here we present a tagging SNP selection method that allows for genotyping errors. Our method is a modification of the pairwise r² tagging SNP selection algorithm proposed by Carlson et al. (2004). We replaced the standard EM algorithm in Carlson's method with an EM algorithm that accounts for genotyping errors, in an attempt to obtain better estimates of the haplotype frequencies and of the r² measure. Through simulation studies we compared the performance of our modified algorithm with that of the original algorithm. We found that the number of tags selected by both methods increased with increasing genotyping error rates, though our method led to a smaller increase. The power of haplotype association tests using the selected tags decreased dramatically with increasing genotyping errors. The power of single-marker tests also decreased, but by less than the power of haplotype tests. When the mean number of tags selected by both methods was restricted to be similar to the baseline number, Carlson's method and our method led to similar power for the subsequent haplotype and single-marker tests. Our results showed that, by accounting for random genotyping errors, our method can select tagging SNPs more efficiently than Carlson's method. The computer program that implements our modified tagging SNP selection algorithm is available at our web site: http://www.personal.psu.edu/tuy104/.
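
To make the pairwise r² selection step concrete, the following is a minimal Python sketch of greedy r²-based binning in the spirit of the algorithm described above. It is not the authors' program: it assumes phased, error-free haplotypes coded as 0/1 alleles, so r² is computed directly from haplotype counts and no EM step (standard or error-aware) is involved. The function names `r_squared` and `greedy_tags`, the default threshold of 0.8, and the tie-breaking rule (lowest SNP index) are all illustrative assumptions.

```python
from itertools import combinations


def r_squared(a, b):
    """Pairwise r^2 between two biallelic SNPs.

    a and b hold the 0/1 allele carried by each phased haplotype
    (assumption: phase is known and genotypes are error free).
    """
    n = len(a)
    p_a = sum(a) / n                                  # frequency of allele 1 at SNP a
    p_b = sum(b) / n                                  # frequency of allele 1 at SNP b
    p_ab = sum(x & y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                              # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0        # monomorphic SNPs get r^2 = 0


def greedy_tags(haplotypes, threshold=0.8):
    """Greedy binning: repeatedly pick the SNP linked (r^2 >= threshold)
    to the most remaining SNPs, and remove it together with its partners.

    haplotypes: list of haplotypes, each a list of 0/1 alleles.
    Returns a list of (tag_index, sorted_bin_members) tuples.
    """
    n_snps = len(haplotypes[0])
    cols = [[h[j] for h in haplotypes] for j in range(n_snps)]

    # Each SNP is trivially linked to itself.
    linked = {j: {j} for j in range(n_snps)}
    for i, j in combinations(range(n_snps), 2):
        if r_squared(cols[i], cols[j]) >= threshold:
            linked[i].add(j)
            linked[j].add(i)

    remaining = set(range(n_snps))
    bins = []
    while remaining:
        # Ties are broken by lowest SNP index (an arbitrary illustrative choice).
        tag = max(sorted(remaining), key=lambda s: len(linked[s] & remaining))
        members = linked[tag] & remaining
        bins.append((tag, sorted(members)))
        remaining -= members
    return bins


if __name__ == "__main__":
    # Toy data: SNP 0 and SNP 1 are perfectly correlated, SNP 2 is independent.
    haps = [[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
    print(greedy_tags(haps))
```

In this sketch a genotyping error would enter only through the input alleles; an error-aware approach such as the one described above would instead change how the haplotype frequencies behind r² are estimated.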

