Abstract

The human genome contains extensive copy number variations (CNVs). Investigating their medical and evolutionary impact requires knowledge of their locations, sizes and frequency distributions within and between populations. However, CNVs in Chinese minorities, which harbor the majority of the genetic diversity of Chinese populations, have been understudied relative to the efforts made in other populations. Here we constructed what is, to our knowledge, the first CNV map of seven Chinese populations representing the major linguistic groups in China, identifying 1,440 CNV regions using the Affymetrix SNP 6.0 array. We observed considerable differences in the distribution of CNV regions between populations, as well as substantial population structure. We showed that ∼35% of the CNV regions identified in minority ethnic groups are not shared with the Han Chinese population, indicating that the contribution of the minorities to the genetic architecture of the Chinese population cannot be ignored. We further identified CNV regions that are highly differentiated between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%, respectively), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the deletion containing the CCL3L1 gene, reported in previous studies as the most differentiated CNV between HapMap CEU and YRI, was also the most highly differentiated region between Tibetan and other populations. In addition, by jointly analyzing CNVs and SNPs, we found that a CNV region containing the gene CTDSPL was in almost perfect linkage disequilibrium with flanking SNPs in Tibetan but not in other populations except HapMap CHD. Furthermore, the SNP taggability of CNVs in Chinese populations was much lower than in European populations. Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource for further evolutionary and medical studies.

Highlights

  • Copy number variation (CNV) is a widespread type of genetic variation in the human genome, defined as a DNA segment larger than one kilobase that differs in copy number between two or more genomes [1,2,3,4]

  • The number of flanking SNPs is likely to account for the loss of taggability, because ascertainment bias in both SNP discovery and array design exists between Asian and European populations. The average number of flanking SNPs for high-taggability copy number polymorphisms (CNPs; r² ≥ 0.8) was 14.7, compared with 11.7 for low-taggability CNPs (r² < 0.8), a marginally significant difference (P-value = 0.0509, two-tailed t-test). This indicates that further effort on array design is needed to fully understand CNV characteristics in Chinese populations

  • Using the Affymetrix SNP 6.0 array, we generated a comprehensive CNV map of Chinese populations by analyzing 155 healthy individuals from 7 ethnic groups
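
The taggability criterion mentioned in the highlights (r² ≥ 0.8 between a CNP and a flanking SNP) can be illustrated with a minimal sketch. The text here does not state how r² was computed, so this example assumes the squared Pearson correlation between integer copy number and SNP allele dosage (0, 1 or 2), with the best flanking SNP taken as the tag. The function names, SNP identifiers and toy genotypes below are hypothetical and are not data from the study.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker carries no tagging information
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def best_tag(cnv, flanking_snps):
    """Return (snp_id, r2) for the flanking SNP most correlated with the CNV."""
    best_id, best_r2 = None, 0.0
    for snp_id, dosages in flanking_snps.items():
        r2 = r_squared(cnv, dosages)
        if r2 > best_r2:
            best_id, best_r2 = snp_id, r2
    return best_id, best_r2


def is_tagged(cnv, flanking_snps, threshold=0.8):
    """Assumed criterion: a CNP is 'tagged' if its best r2 reaches the threshold."""
    return best_tag(cnv, flanking_snps)[1] >= threshold


# Toy data (hypothetical): copy number per individual and SNP allele dosages.
cnv = [2, 2, 1, 1, 0, 2, 1, 2]
flanking = {
    "snp_a": [0, 0, 1, 1, 2, 0, 1, 0],  # mirrors the deletion exactly
    "snp_b": [0, 1, 0, 1, 0, 1, 0, 1],  # only weakly related to the CNV
}
tag_id, tag_r2 = best_tag(cnv, flanking)
```

Under this assumed definition, a CNP with fewer flanking SNPs has fewer candidates for `best_tag` to search, which is one way sparser SNP coverage could lower taggability.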

Summary

Introduction

Copy number variation (CNV) is a widespread type of genetic variation in the human genome, defined as a DNA segment larger than one kilobase that differs in copy number between two or more genomes [1,2,3,4]. Apart from CNV discovery in the Han Chinese samples of the HapMap project [2,16,19,20], several studies have reported CNV discovery in the Han Chinese population [6,21,22,23,24]. None of these studies focused on Chinese minority populations.

