Abstract
Background: DNA sequence diversity within the human genome may be more greatly affected by copy number variations (CNVs) than single nucleotide polymorphisms (SNPs). Although the importance of CNVs in genome-wide association studies (GWAS) is becoming widely accepted, the optimal methods for identifying these variants are still under evaluation. We have previously reported a comprehensive view of CNVs in the HapMap DNA collection using high-density 500 K EA (Early Access) SNP genotyping arrays, which revealed more than 1,000 CNVs ranging in size from 1 kb to over 3 Mb. Although the arrays used most commonly for GWAS predominantly interrogate SNPs, CNV identification and detection do not necessarily require the use of DNA probes centered on polymorphic nucleotides and may even be hindered by the dependence on a successful SNP genotyping assay.

Results: In this study, we have designed and evaluated a high-density array predicated on the use of non-polymorphic oligonucleotide probes for CNV detection. This approach effectively uncouples copy number detection from SNP genotyping and thus has the potential to significantly improve probe coverage for genome-wide CNV identification. This array, in conjunction with PCR-based, complexity-reduced DNA target, queries over 1.3 M independent NspI restriction enzyme fragments in the 200 bp to 1100 bp size range, which is a several-fold increase in marker density compared with the 500 K EA array. In addition, a novel algorithm was developed and validated to extract CNV regions and boundaries.

Conclusion: Using a well-characterized pair of DNA samples, close to 200 CNVs were identified, of which nearly 50% appear novel yet were independently validated using quantitative PCR. The results indicate that non-polymorphic probes provide a robust approach for CNV identification, and the increasing precision of CNV boundary delineation should allow a more complete analysis of their genomic organization.
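The fragment-size window mentioned in the Results can be illustrated with a minimal in silico digest. This is only a sketch under stated assumptions, not the authors' target-preparation pipeline or their CNV algorithm: it assumes the NspI recognition site RCATG^Y (R = A or G, Y = C or T, cut after the G), a complete digest of a linear sequence, and it ignores adaptor ligation and PCR amplification bias. The function names `nspi_fragments` and `in_size_window` are hypothetical; the 200 bp to 1100 bp bounds are taken from the abstract.

```python
import re

# Assumption: NspI recognizes RCATG^Y (R = A/G, Y = C/T) and cuts after the G.
# A lookahead is used so that overlapping sites are all reported.
NSPI_SITE = re.compile(r"(?=[AG]CATG[CT])")
CUT_OFFSET = 5  # the cut falls between base 5 (G) and base 6 (Y) of the site


def nspi_fragments(seq):
    """Return fragment lengths from a complete digest of a linear sequence."""
    seq = seq.upper()
    cuts = [m.start() + CUT_OFFSET for m in NSPI_SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def in_size_window(lengths, lo=200, hi=1100):
    """Keep fragment lengths inside the size window quoted in the abstract."""
    return [n for n in lengths if lo <= n <= hi]


if __name__ == "__main__":
    # Toy sequence with two NspI sites (ACATGC and GCATGT).
    toy = "A" * 300 + "ACATGC" + "T" * 500 + "GCATGT" + "A" * 50
    lengths = nspi_fragments(toy)
    print(lengths, in_size_window(lengths))
```

On a real reference genome the same filter would be applied per chromosome; the fraction of fragments that survive the window is what limits how many independent loci a complexity-reduced target can query.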
Highlights
DNA sequence diversity within the human genome may be more greatly affected by copy number variations (CNVs) than single nucleotide polymorphisms (SNPs)
With the completion of the human genome sequence, it is generally accepted that any two individuals are ~99.9% identical at the nucleotide level, and that the presence of single nucleotide polymorphisms (SNPs) in the genome is the major contributor to genetic diversity among humans [1]
The NspI whole genome sampling analysis (WGSA) target interrogates ~250 K SNPs, each of which generally resides on a unique restriction fragment
Summary
DNA sequence diversity within the human genome may be more greatly affected by copy number variations (CNVs) than single nucleotide polymorphisms (SNPs). These changes can span a spectrum from, for example, an extra copy of an entire chromosome (trisomy 21) in Down's syndrome to sub-chromosomal deletions responsible for genetic traits such as color blindness and α and β thalassemias [4]. This paradigm of genetic variation underwent a major revision in 2004 with the identification of genome-wide copy number variants that occur among phenotypically normal individuals [5,6]. A recent comparison of the genome sequence of an individual human with the NCBI human reference assembly suggested that DNA copy number variable regions contribute ~10 Mb to sequence heterogeneity [31]. These results underlie the growing appreciation of the need to account for CNVs in genome-wide association studies. There is still an ongoing need to develop molecular methods capable of direct and accurate detection of CNVs so that this new class of polymorphisms can be effectively incorporated into genome-wide LD mapping of genes involved in human disease [33].