Abstract

Background: Along with the improvement of high-throughput sequencing technologies, the genetics community is showing marked interest in the rare variants/common diseases hypothesis. While sequencing can still be prohibitive for large studies, commercially available genotyping arrays targeting rare variants prove to be a reasonable alternative. A technical challenge of array-based methods is the task of deriving genotype classes (homozygous or heterozygous) by clustering intensity data points. The performance of clustering tools for common polymorphisms is well established, while their performance on data with a large proportion of rare variants (where data points are sparse for genotypes containing the rare allele) is less well known. We have compared the performance of four clustering tools (GenCall, GenoSNP, optiCall and zCall) for the genotyping of over 10,000 samples using Illumina’s HumanExome BeadChip, which includes 247,870 variants, 90% of which have a minor allele frequency below 5% in a population of European ancestry. Different reference parameters for GenCall and different initial parameters for GenoSNP were tested. Genotyping accuracy was assessed using data from the 1000 Genomes Project as a gold standard, and agreement between tools was measured.

Results: Concordance of GenoSNP’s calls with the gold standard was below expectations and was increased by changing the tool’s initial parameters. While the four tools provided concordance with the gold standard above 99% for common alleles, some of them performed poorly for rare alleles. The reproducibility of genotype calls for each tool was assessed using experimental duplicates, which provided concordance rates above 99%. The inter-tool agreement of genotype calls was high for approximately 95% of variants. Most tools yielded similar error rates (approximately 0.02), except for zCall, which performed better with a 0.00164 mean error rate.

Conclusions: The GenoSNP clustering tool could not be run straight “out of the box” with the HumanExome BeadChip, as modification of hard-coded parameters was necessary to achieve optimal performance. Overall, GenCall marginally outperformed the other tools for the HumanExome BeadChip. The use of experimental replicates provided a valuable quality control tool for genotyping projects with rare variants.

Highlights

  • Along with the improvement of high-throughput sequencing technologies, the genetics community is showing marked interest in the rare variants/common diseases hypothesis

  • Several commercial arrays include a large fraction of rare single nucleotide variants (SNV) discovered by high-throughput sequencing technologies

  • The Illumina HumanExome BeadChip provides a compromise between single nucleotide polymorphism (SNP) genotyping arrays and next-generation sequencing by enabling the genotyping of rare SNVs in thousands of samples at relatively low cost


Summary

Introduction

Along with the improvement of high-throughput sequencing technologies, the genetics community is showing marked interest in the rare variants/common diseases hypothesis. We have compared the performance of four clustering tools (GenCall, GenoSNP, optiCall and zCall) for the genotyping of over 10,000 samples using Illumina’s HumanExome BeadChip, which includes 247,870 variants, 90% of which have a minor allele frequency below 5% in a population of European ancestry. The majority of genome-wide association studies (GWAS) were conducted using common single nucleotide polymorphism (SNP) arrays targeting previously identified markers. SNP arrays offer the possibility of rapid genotyping of thousands of samples with highly reliable results at low cost. Several commercial arrays include a large fraction of rare single nucleotide variants (SNV) discovered by high-throughput sequencing technologies. Sequencing, while still expensive compared to SNP arrays, allows for the discovery of all variants, rare and common, located in the genome of sequenced individuals. Compared to other genotyping platforms targeting millions of markers, the proportion of rare variants (minor allele frequency < 5%) included in the HumanExome BeadChip is considerably larger.
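Two quantities recur in the text above: the minor allele frequency (MAF), with 5% as the threshold for a rare variant, and the concordance of genotype calls with a gold standard. The Python sketch below illustrates how each could be computed for a single variant. It is not the authors' pipeline: the genotype coding (count of B alleles, `None` for a no-call), the function names, and the choice to exclude no-calls from both measures are assumptions made for illustration only.

```python
from typing import Optional, Sequence

# Assumed coding: number of B alleles per sample (0, 1 or 2); None is a no-call.
Genotype = Optional[int]


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Return the minor allele frequency among the called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each called sample carries two alleles.
    b_freq = sum(called) / (2 * len(called))
    return min(b_freq, 1.0 - b_freq)


def is_rare(genotypes: Sequence[Genotype], threshold: float = 0.05) -> bool:
    """Classify a variant as rare when its MAF is below the threshold."""
    return minor_allele_frequency(genotypes) < threshold


def concordance(calls: Sequence[Genotype], gold: Sequence[Genotype]) -> float:
    """Fraction of samples, called in both sets, whose genotypes agree."""
    if len(calls) != len(gold):
        raise ValueError("call sets must cover the same samples")
    pairs = [(c, g) for c, g in zip(calls, gold)
             if c is not None and g is not None]
    if not pairs:
        raise ValueError("no samples called in both sets")
    return sum(c == g for c, g in pairs) / len(pairs)


# Hypothetical data: 20 samples, one heterozygote in the gold standard.
gold_standard = [0] * 19 + [1]
tool_calls = [0] * 18 + [None, 0]  # one no-call, and the heterozygote is missed

print(minor_allele_frequency(gold_standard))
print(is_rare(gold_standard))
print(concordance(tool_calls, gold_standard))
```

In this made-up example the only carrier of the rare allele is miscalled, yet overall concordance stays high because most samples are homozygous for the common allele. This is one reason a per-variant summary over all samples can hide errors on rare alleles.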
