Abstract
Background
High throughput microarray-based single nucleotide polymorphism (SNP) genotyping has revolutionized the way genome-wide linkage scans and association analyses are performed. One of the key features of the array-based GeneChip® Mapping 10K Array from Affymetrix is the automated SNP calling algorithm. The Affymetrix algorithm was trained on a database of ethnically diverse DNA samples to create SNP call zones that are used as static models to make genotype calls for experimental data. We describe here the implementation of clustering algorithms on large training datasets resulting in improved SNP call rates on the 10K GeneChip.
Results
A database of 948 individuals genotyped on the GeneChip® Mapping 10K 2.0 Array was used to identify 822 SNPs that were called consistently less than 75% of the time. These SNPs represent on average 8.25% of the total SNPs on each chromosome with chromosome 19, the most gene-rich chromosome, containing the highest proportion of poor performers (18.7%). To remedy this, we created SNiPer, a new application which uses two clustering algorithms to yield increased call rates and equivalent concordance to Affymetrix called genotypes. We include a training set for these algorithms based on individual genotypes for 705 samples. SNiPer has the capability to be retrained for lab-specific training sets. SNiPer is freely available for download at .
Conclusion
The correct calling of poor performing SNPs may prove to be key in future linkage studies performed on the 10K GeneChip. It would prove particularly invaluable for those diseases that map to chromosome 19, known to contain a high proportion of poorly performing SNPs. Our results illustrate that SNiPer can be used to increase call rates on the 10K GeneChip® without sacrificing accuracy, thereby increasing the amount of valid data generated.
Highlights
High throughput microarray-based single nucleotide polymorphism (SNP) genotyping has revolutionized the way genome-wide linkage scans and association analyses are performed
The GeneChip® Mapping Array relies on the hybridization of biotin-tagged fragments of SNP-containing DNA to complementary DNA oligomers chemically tiled on a silicon wafer in order to genotype 10,204 SNPs with a mean inter-marker spacing of 258 Kb [7]
Identification and characterization of poorly behaving SNPs on the 10K GeneChip®: In order to identify those SNPs that frequently result in a "NoCall" on the 10K GeneChip®, we compiled a database of 948 individuals that were genotyped in the last two months in our laboratory
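The screening step described above (flagging SNPs that are called less than 75% of the time across a sample database) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the in-memory dictionary layout, the SNP identifiers, and the use of the string "NoCall" as the missing-genotype marker are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed marker for a missing genotype; the real export format may differ.
NO_CALL = "NoCall"


def call_rates(genotypes: Dict[str, List[str]]) -> Dict[str, float]:
    """Return, for each SNP, the fraction of samples with a genotype call."""
    rates = {}
    for snp_id, calls in genotypes.items():
        if not calls:
            continue  # skip SNPs with no samples to avoid dividing by zero
        called = sum(1 for call in calls if call != NO_CALL)
        rates[snp_id] = called / len(calls)
    return rates


def poor_performers(genotypes: Dict[str, List[str]],
                    threshold: float = 0.75) -> List[str]:
    """Return sorted SNP ids whose call rate is strictly below the threshold."""
    rates = call_rates(genotypes)
    return sorted(snp for snp, rate in rates.items() if rate < threshold)


# Hypothetical toy data: three SNPs genotyped in four samples each.
example = {
    "snp_a": ["AA", "AB", NO_CALL, "BB"],      # call rate 0.75, not flagged
    "snp_b": [NO_CALL, NO_CALL, "AB", "AA"],   # call rate 0.50, flagged
    "snp_c": ["AA", "AA", "AA", "AA"],         # call rate 1.00, not flagged
}
flagged = poor_performers(example)
```

The threshold is strict ("less than 75%"), so a SNP called in exactly 75% of samples is not flagged in this sketch.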
Summary
High throughput microarray-based single nucleotide polymorphism (SNP) genotyping has revolutionized the way genome-wide linkage scans and association analyses are performed. SNPs are fast becoming the markers of choice for genome-wide linkage scans, loss of heterozygosity (LOH), comparative genomic hybridization (CGH) and whole-genome association studies [1]. This is due to the existence of high throughput technologies like the GeneChip® Human Mapping Array from Affymetrix, coupled with the abundant and uniform distribution of SNPs throughout the human genome [26]. The assay uses a relatively small amount of genomic DNA (250 ng) and a series of reactions called fragment selection by PCR (FSP). The PCR products are digested to a size of ~50 bp with DNase I, end-labeled with biotin, and hybridized to the microarray wafer.