Abstract

Several methods to identify tagging single-nucleotide polymorphisms (SNPs) are in common use for genetic epidemiologic studies; however, there may be loss of information when using only a subset of SNPs. We sought to compare the ability of commonly used pairwise, multimarker, and haplotype-based tagging SNP selection methods to detect known associations with quantitative expression phenotypes. Using data from HapMap release 21 on unrelated Utah residents with ancestors from northern and western Europe (CEPH-Utah, CEU), we selected tagging SNPs in five chromosomal regions using ldSelect, Tagger, and TagSNPs. We found that SNP subsets did not substantially overlap, and that the use of trio data did not greatly impact SNP selection. We then tested associations between HapMap genotypes and expression phenotypes on 28 CEU individuals as part of Genetic Analysis Workshop 15. Relative to the use of all SNPs (n = 210 SNPs across all regions), most subset methods were able to detect single-SNP and haplotype associations. Generally, pairwise selection approaches worked extremely well, relative to use of all SNPs, with marked reductions in the number of SNPs required. Haplotype-based approaches, which had identified smaller SNP subsets, missed associations in some regions. We conclude that the optimal tagging SNP method depends on the true model of the genetic association (i.e., whether a SNP or haplotype is responsible); unfortunately, this is often unknown at the time of SNP selection. Additional evaluations using empirical and simulated data are needed.

Highlights

  • The development and application of methods that use linkage disequilibrium (LD) for single-nucleotide polymorphism (SNP) selection have empowered genetic epidemiologic studies

  • Tagging SNP selection is implemented in commonly used, publicly available software packages that assess data from unrelated individuals or small families. ldSelect [4] performs pairwise selection using a binning algorithm; Tagger [5] selects SNPs using pairwise and multimarker methods and allows for inclusion of trio data to reduce phase uncertainty; and TagSNPs v. 2.0-beta [6] implements pairwise, multimarker, and haplotype methods, also allowing for the inclusion of trio data

  • The LRAP region included the most HapMap SNPs (n = 72, Table 1) and had strong linkage disequilibrium (LD); the HLA-DRB2 region had a large number of SNPs and low LD; the AA827892 region included only 16 SNPs in strong LD; and the CPNE1 and CSTB regions were of intermediate size with modest/variable LD

Introduction

The development and application of methods that use linkage disequilibrium (LD) for single-nucleotide polymorphism (SNP) selection have empowered genetic epidemiologic studies. SNP redundancy can be reduced, allowing for improved information/coverage within the constraints of a fixed budget. Three classes of tagging SNP methods have the following aims: 1) correlate each SNP of interest with a genotyped SNP (pairwise methods), 2) correlate each SNP of interest with a genotyped SNP or a combination of genotyped SNPs (multimarker methods), or 3) explain each haplotype of interest using a set of genotyped SNPs (haplotype-based methods). Tagging SNP selection is implemented in commonly used, publicly available software packages that assess data from unrelated individuals (founders) or small families (trios). ldSelect [4] performs pairwise selection using a binning algorithm; Tagger [5] selects SNPs using pairwise and multimarker methods and allows for inclusion of trio data to reduce phase uncertainty; and TagSNPs v. 2.0-beta [6] implements pairwise, multimarker, and haplotype methods, also allowing for the inclusion of trio data.
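
To make the pairwise class of methods concrete, the following is a minimal sketch of greedy pairwise tagging with bins. It is written in the spirit of a binning approach and is not the implementation used by ldSelect, Tagger, or TagSNPs. Several things here are assumptions for illustration only: the function names (`r_squared`, `greedy_pairwise_tags`), the toy genotypes, the threshold of 0.8, and the use of the squared correlation between unphased genotype dosages (0/1/2) as a stand-in for haplotype-based r².

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2).

    Assumption: dosage correlation is used as a simple proxy for LD r^2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as uncorrelated
    return cov * cov / (var_x * var_y)


def greedy_pairwise_tags(genotypes, threshold=0.8):
    """Return {tag_snp: [snps captured by that tag]} using a greedy pairwise rule.

    genotypes: dict mapping SNP name -> list of dosages, one per individual.
    threshold: hypothetical r^2 cutoff for one SNP to "capture" another.
    """
    names = list(genotypes)
    # For each SNP, the set of SNPs (including itself) it captures at the cutoff.
    captures = {
        a: {b for b in names
            if a == b or r_squared(genotypes[a], genotypes[b]) >= threshold}
        for a in names
    }
    untagged = set(names)
    bins = {}
    while untagged:
        # Keep input order so ties are broken deterministically.
        candidates = [s for s in names if s in untagged]
        best = max(candidates, key=lambda s: len(captures[s] & untagged))
        members = [s for s in names if s in captures[best] and s in untagged]
        bins[best] = members
        untagged -= set(members)
    return bins


# Toy data (invented for illustration): snp1-snp3 are perfectly correlated,
# snp4 is only weakly correlated with them.
toy_genotypes = {
    "snp1": [0, 1, 2, 0, 1, 2],
    "snp2": [0, 1, 2, 0, 1, 2],
    "snp3": [2, 1, 0, 2, 1, 0],
    "snp4": [0, 0, 1, 2, 1, 0],
}
bins = greedy_pairwise_tags(toy_genotypes, threshold=0.8)
print(bins)
```

In this sketch each bin is represented by a single tag, so the number of tagging SNPs equals the number of bins. Multimarker and haplotype-based methods differ in that a SNP or haplotype may be captured by a combination of genotyped SNPs rather than by one tag.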
