Abstract

Background: Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms which make use of different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and data from a recently published genome-wide association study.

Results: In general, differences in accuracy are relatively small between the methods evaluated, although CRLMM and GenoSNP were found to consistently outperform GenCall. The performance of Illuminus is heavily dependent on sample size, with lower no call rates and improved accuracy as the number of samples available increases. For X chromosome SNPs, methods with sex-dependent models (Illuminus, CRLMM) perform better than methods which ignore gender information (GenCall, GenoSNP). We observe that CRLMM and GenoSNP are more accurate at calling SNPs with low minor allele frequency than GenCall or Illuminus. The sample quality metrics from each of the four methods were found to have a high level of agreement at flagging samples with unusual signal characteristics.

Conclusions: CRLMM, GenoSNP and GenCall can be applied with confidence in studies of any size, as their performance was shown to be invariant to the number of samples available. Illuminus on the other hand requires a larger number of samples to achieve comparable levels of accuracy and its use in smaller studies (50 or fewer individuals) is not recommended.

Highlights

  • Illumina’s Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies

  • The model-based clustering can occur within sample (GenoSNP) or between samples (GenCall, Illuminus, CRLMM)

  • The drop rate refers to the proportion of SNPs which have been removed from the accuracy calculation based on low call confidence measures
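The drop rate defined in the last highlight can be made concrete with a small sketch. The function name, the confidence scale (higher means more confident) and the threshold are assumptions made for illustration; each calling method reports its own confidence measure, and this is not the evaluation code used in the study.

```python
def accuracy_at_threshold(calls, truth, confidence, threshold):
    """Return (drop_rate, accuracy) after dropping low-confidence calls.

    Hypothetical helper: `confidence` is assumed to be "higher is better",
    and a call is dropped when its confidence falls below `threshold`.
    """
    if not (len(calls) == len(truth) == len(confidence)):
        raise ValueError("calls, truth and confidence must have equal length")

    # Keep only the (call, truth) pairs that pass the confidence filter.
    kept = [(c, t) for c, t, q in zip(calls, truth, confidence) if q >= threshold]

    n_total = len(calls)
    # Proportion of SNPs removed from the accuracy calculation.
    drop_rate = 1 - len(kept) / n_total if n_total else 0.0
    # Accuracy is computed on the retained calls only.
    accuracy = sum(c == t for c, t in kept) / len(kept) if kept else float("nan")
    return drop_rate, accuracy


calls = ["AA", "AB", "BB", "AB"]
truth = ["AA", "AB", "AB", "AB"]
confidence = [0.9, 0.8, 0.2, 0.95]
drop_rate, accuracy = accuracy_at_threshold(calls, truth, confidence, threshold=0.5)
```

Raising the threshold trades a higher drop rate for accuracy computed on a smaller, more confidently called set, which is why the two numbers are usually reported together.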

Summary

Introduction

Illumina’s Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. High-density SNP microarrays cataloguing variation identified in the HapMap project [2] have been the enabling technology behind large-scale genome-wide association studies. These microarrays allow the collection of genotypes for many SNPs in many individuals at relatively low cost. A number of algorithms are available for processing the raw signal from these arrays into genotype calls. These methods include: GenCall [5], Illumina’s proprietary method implemented in the BeadStudio/GenomeStudio software; Illuminus [6]; GenoSNP [7]; CRLMM [8,9,10]; Birdseed, available in the Birdsuite software [11]; and BeagleCall [12].
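To illustrate what turning raw signal into a genotype call involves, the following toy sketch maps one pair of allele A and allele B intensities to AA, AB, BB or a no call. It is not GenCall, Illuminus, GenoSNP or CRLMM: those methods fit statistical models to clusters of intensities, whereas this sketch assumes already-normalised, non-negative intensities and uses fixed, hypothetical cutoffs (`min_total`, `low`, `high`) on the angle between the two channels.

```python
import math


def call_genotype(a, b, min_total=0.2, low=0.25, high=0.75):
    """Toy genotype call from allele A and allele B intensities.

    All thresholds are illustrative assumptions, not values from any
    published calling method.
    """
    # Too little signal in both channels: refuse to call.
    if a + b < min_total:
        return "NC"

    # Angle between the channels, rescaled to [0, 1] for non-negative inputs:
    # near 0 the signal is mostly allele A, near 1 mostly allele B.
    theta = (2 / math.pi) * math.atan2(b, a)

    if theta < low:
        return "AA"
    if theta > high:
        return "BB"
    return "AB"


examples = [(1.0, 0.05), (0.5, 0.5), (0.05, 1.0), (0.05, 0.05)]
genotypes = [call_genotype(a, b) for a, b in examples]
```

Fixed cutoffs like these ignore SNP-to-SNP differences in cluster position, which is one reason model-based clustering is used in practice.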

